Abstract

The goal of this thesis was to study the performance of three commercial object-oriented
database management systems. The commercial systems studied were Itasca,
sold by Itasca Systems Incorporated; Matisse, sold by Intellitic International; and
ObjectStore, sold by Object Design Incorporated. To examine the performance of these
database management systems, two benchmarks were run: the OO1 benchmark and a new
AFIT Simulation benchmark. The OO1 benchmark was designed, implemented, and run on all
three database management systems. ObjectStore was our top performer on all configurations
of the OO1 benchmark. The AFIT Simulation benchmark was designed, implemented,
and run on the ObjectStore database management system. A non-persistent version of
the benchmark was also created in the C++ programming language. There was minimal
performance overhead incurred due to the use of ObjectStore, especially when compared
to the functional benefits gained. We concluded that there are major differences between
the performance levels offered by current commercial object-oriented database management
systems. We also concluded that a programming language interface to an object-oriented
database management system should not be middle ground. Either it should be closely
tied to a specific language or not tied to a specific language at all.
PERFORMANCE MEASUREMENT OF THREE COMMERCIAL
OBJECT-ORIENTED DATABASE MANAGEMENT SYSTEMS

I.  Introduction

1.1  Background 

The next generation of database management systems, object-oriented database
management systems, is starting to arrive on the market today. The various implementations
of these systems are incredibly diverse, much more diverse than the relational database
systems that have been dominant since the mid-1980s. All relational database management
systems are based on the relational data model. The relational data model is uniformly
used in different database implementations and is firmly based in mathematics (set theory
and first order predicate logic) [12]. In contrast, the emerging object-oriented data model
is neither uniformly implemented nor firmly based in mathematics. In fact, exactly what
traits are required for a database to claim that it is object-oriented is still under some
debate [2, 37]. This lack of agreement has created diversity in the capabilities of today’s
object-oriented database management systems.

Despite  this  diversity,  object-oriented  DBMSs  are  indeed  useful  today.  They  are 
bringing  DBMS  functionality  to  applications  which  traditionally  have  used  only  custom 
file-based  storage  systems.  Engineering  applications,  such  as  Computer  Aided  Design 
(CAD)  and  Computer  Aided  Publishing  (CAP),  and  also  computer  simulation,  have  not 
widely  used  existing  commercial  DBMSs  for  several  reasons.  The  most  critical  reason  is 
performance.  Interactive  engineering  applications  require  database  systems  which  are  ten 
to  one  hundred  times  faster  than  traditional  DBMSs  [9].  Some  object-oriented  DBMSs 
can  provide  this  much-needed  level  of  performance. 

Because performance is a critical requirement, it is imperative to be able to measure
the performance of object-oriented DBMSs. It is also necessary to focus performance
measurement on those object-oriented DBMS services which are the most critical. To
identify which services are the most critical, the applications which use object-oriented
DBMSs must be investigated. Once critical services have been identified, a benchmark can
be used as a tool to measure object-oriented DBMS performance.

A  benchmark  is  a  program  used  to  quantitatively  measure  the  performance  of  any 
software  or  hardware  system.  Gray  notes  that  a  benchmark  can  be  thought  of  as  a  workload 
[14:1].  The  hardware  or  software  on  which  the  benchmark  is  run  is  called  the  system  under 
test.  The  performance  measure  of  a  system  under  test  on  a  benchmark  could  be  a  time  to 
completion (seconds) or a throughput metric (work/seconds). The performance measurement
can be combined with the price of the system under test to give a price/performance
ratio. 
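The two kinds of performance measure described above can be illustrated with a minimal timing harness. This is only a sketch: the function name `measure` and its parameters are hypothetical, and a real benchmark would also control for caching, warm-up, and network effects.

```python
import time

def measure(workload, operations):
    """Run `workload` `operations` times and report both metrics.

    Returns (time to completion in seconds, throughput in operations/second).
    """
    start = time.perf_counter()
    for _ in range(operations):
        workload()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    throughput = operations / elapsed if elapsed > 0 else float("inf")
    return elapsed, throughput

if __name__ == "__main__":
    elapsed, throughput = measure(lambda: sum(range(1000)), 10_000)
    print(f"time to completion: {elapsed:.4f} s")
    print(f"throughput: {throughput:.0f} operations/s")
```

Here the system under test is simply the Python interpreter running a trivial workload; substituting a database operation for `workload` gives the same two measures for a DBMS.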

Good  performance  of  a  system  under  test  on  a  benchmark  does  not  indicate  that 
the  system  will  perform  well  on  every  type  of  application.  The  benchmark  is  only  a 
valid  yardstick  for  applications  which  are  similar  to  the  benchmark.  For  example,  if  you 
are  planning  to  build  a  software  system  which  performs  a  great  deal  of  floating-point 
operations,  then  examining  the  results  of  floating-point  operations  benchmarks  can  aid  in 
your  selection  of  a  computer  system.  But  if  you  are  planning  to  do  word  processing  on  the 
computer  system,  then  the  results  of  floating-point  operations  benchmarks  will  be  useless. 

Benchmarks  allow  comparison  of  different  systems  for  an  application  without  actually 
having  to  build  the  application  on  all  the  different  systems  under  consideration.  When  an 
application  is  going  to  be  very  large,  it  is  often  impossible  to  build  the  complete  application 
to  test  different  systems.  Therefore,  an  important  property  of  benchmarks  is  that  they  be 
small, and thus reasonably simple to implement. To date, there are four existing
object-oriented DBMS benchmarks:

•  Simple  Database  Operations 

•  HyperModel 

•  Object Operations Version 1 (OO1)

•  OO7

The Simple Database Operations benchmark uses a database of authors and books to
measure performance on “simple, object-oriented queries that engineering database
applications perform” [35:387]. The HyperModel benchmark is a more complex version of
the Simple Database Operations benchmark [15]. The OO1 benchmark is a simpler, more
focused version of the Simple Database Operations benchmark [9]. The OO7 benchmark
is a new benchmark developed at the University of Wisconsin which attempts to be a more
complete measure of database performance than the OO1 benchmark [7].

1.2  Problem  Statement 

The  problem  was  that  it  was  not  known  if  the  performance  of  the  object-oriented 
DBMSs  available  at  AFIT  was  good  enough  to  be  used  for  research  applications,  especially 
research  in  computer  simulation.  To  determine  this,  it  was  necessary  to  benchmark  the 
performance of the three commercial object-oriented DBMSs available at AFIT. This problem
was complicated by the diversity of object-oriented DBMS interfaces, the complexity
of  the  object-oriented  DBMSs,  the  wide  variety  of  applications  to  which  object-oriented 
DBMSs  can  be  applied,  and  the  question  of  what  specific  services  simulation  applications 
require. 

1.3  Objectives 

The primary objective of this thesis was to measure the performance and functionality
of the three commercial object-oriented DBMSs available at AFIT.

A second objective was to create a new or extended benchmark for simulation
applications. This was necessary to give simulation research projects a yardstick for evaluating
whether any of the object-oriented DBMSs available at AFIT would be useful to them.

1.4  Methodology 

This research was conducted in four stages. In each stage the following three
commercial object-oriented DBMSs were tested:

•  Itasca 
•  Matisse


•  ObjectStore 

These three object-oriented DBMSs were selected because we believe that they represent
a reasonable cross-section of the commercial object-oriented DBMS industry.

Itasca,  Matisse,  and  ObjectStore  were  used  in  the  following  four  stages  of  research: 

1.  Functional  comparison  of  the  three  commercial  object-oriented  DBMSs. 

2.  Running the OO1 benchmark on the three commercial object-oriented DBMSs.

3.  Creating  a  specification  for  an  object-oriented  DBMS  benchmark  for  the  simulation 
domain. 

4.  Running  the  simulation  benchmark  on  the  three  commercial  object-oriented  DBMSs. 

During stage one we investigated the functional capabilities of the Itasca, Matisse,
and ObjectStore DBMSs. The goal was to determine the functional differences between the
databases. Knowing the functional differences between the databases aided our analysis of
the benchmark results obtained later in this research. A secondary reason for investigating
the functional capabilities of the three commercial object-oriented DBMSs was to
gain enough practical knowledge about the databases to implement the benchmarks
which were constructed in the following stages.

During stage two we created an implementation of the OO1 benchmark for the Itasca,
Matisse, and ObjectStore DBMSs. To investigate the performance of Itasca, Matisse, and
ObjectStore, we first wanted to investigate their performance on a standard, well-defined
benchmark. We selected the OO1 benchmark for the following reasons:

•  Maturity: The OO1 benchmark is the most mature and completely specified of all
the object-oriented DBMS benchmarks. The benchmark evolved from an earlier
benchmark.

•  Industry Acceptance: The OO1 benchmark has wide acceptance from vendors in the
object-oriented  DBMS  industry,  or  at  least  we  felt  that  was  true  at  the  start  of  this 
research. 

•  Reasonable Implementation Effort: To accomplish the implementation of a standard
benchmark, and a new benchmark in the simulation domain, the standard benchmark
must not require an unreasonable amount of time to implement.

Stage three involved creating a specification for a benchmark which would execute a
computer simulation environment inside an object-oriented DBMS. We examined literature
about DBMS benchmarks and current simulations to define a simple simulation. The
benchmark  was  to  be  qualitative  as  well  as  quantitative.  The  benchmark  investigated  the 
ability  of  the  three  commercial  object-oriented  DBMSs  to  support  computer  simulation. 

Stage  four  implemented  the  simulation  defined  in  the  previous  stage.  A  complete 
implementation of the simulation benchmark was created for the Itasca, Matisse, and
ObjectStore  DBMSs. 

1.5  Materials  and  Equipment 

For this research, two Sun SPARCstation 2 workstations were set up as test computers.
The SPARCstation 2 is a general purpose engineering workstation. One workstation,
prowler, acted as the database server, and the other workstation, doc, as the client. The
database server was configured with two additional disk drives, one to hold the DBMS
software, and the other to hold the test databases. The benchmark runs were performed during
the evening to avoid heavy network traffic during tests.

For  this  research  we  used  version  2.2  of  the  Itasca  object-oriented  DBMS.  Itasca 
was  originally  developed  as  the  Orion  database  system  by  Microelectronics  and  Computer 
Technology  Corporation.  The  Orion  database  was  enhanced  and  is  now  sold  by  Itasca 
Systems Incorporated of Minneapolis, Minnesota. We used version 2.2.0 of the Matisse
object-oriented DBMS. Matisse was developed by Intellitic International of France. We
also used version 2.0.1 of the ObjectStore object-oriented DBMS. ObjectStore was developed
by Object Design Incorporated of Burlington, Massachusetts.

1.6  Sequence of Presentation

In Chapter II we present a review of pertinent literature in the area of DBMS benchmarking
and computer simulation. Chapter III investigates the functional capabilities
of, and differences between, the Itasca, Matisse, and ObjectStore object-oriented DBMSs. In
Chapter IV we describe the analysis, design, and implementation of the OO1 benchmark.
We also examine the problems encountered when working with the three object-oriented
DBMSs. Chapter V describes the simulation benchmark developed for this research and
describes our implementation of this benchmark. In Chapter VI we examine our results
from running the OO1 benchmark and the simulation benchmark, and in Chapter VII
present some final conclusions and recommendations.



II.  Literature  Review 


In this chapter we review literature about DBMS benchmarking and computer simulation.
The review of DBMS benchmarks examines the capabilities of existing benchmarks.
The review of computer simulation examines the use of object-oriented DBMSs in simulation
environments today and the capabilities needed by simulation environments.

2.1  DBMS  Benchmarks 

DBMS  benchmarks  are  a  way  to  measure  the  performance  and/or  functionality  of 
a  DBMS.  They  can  also  be  used  to  find  the  lowest-cost  DBMS  and  computer  system 
for a required job. The next four sections survey DBMS benchmarks. First, the criteria
for a good DBMS benchmark are covered. Second, the role of the only DBMS benchmark
standards organization, the Transaction Processing Performance Council, is examined.
Then eight important DBMS benchmarks are examined. For each benchmark we point
out its important strengths and weaknesses. Included are benchmarks
for on-line transaction processing (OLTP), relational, and object-oriented DBMSs. The
following items about each benchmark are examined:

•  The benchmark problem domain

•  The benchmark database

•  The benchmark operations

•  The measurements (or results) of the benchmark

For more detailed information about any specific benchmark, the source documents on
that benchmark should be examined.

2.2  DBMS  Benchmark  Criteria 

DBMS  benchmarks  are  domain-specific  benchmarks.  These  benchmarks  attempt  to 
quantitatively  measure  the  performance  of  a  DBMS  in  a  specific  domain  area,  such  as 
decision support or OLTP. In [14] Gray proposes the following criteria for a good
domain-specific benchmark: relevant, portable, scalable, and simple.

A benchmark must be relevant to be a useful yardstick for DBMS performance. For
example, if you are planning to use a DBMS for OLTP, then results of OLTP benchmarks
can aid in selection of a DBMS and a computer system. But if you are planning to use the
database for a decision support system, then OLTP benchmarks are not useful because
they are not relevant to the decision support domain.

A benchmark must be portable so that it can be run on several different DBMSs and
computer systems. Ideally, a benchmark should be able to be run on all the DBMSs which
support the domain (e.g., OLTP) for which it measures performance.

A benchmark should be scalable to large and small computer systems. As the
capabilities of the computer system increase, the benchmark should “scale-up” to credibly
measure the performance of that computer system. Gray also notes that a benchmark
should  scale-up  to  new  types  of  computer  systems  (e.g.  parallel  computer  systems)  as 
“computer  performance  and  architecture  evolve”  [14:5]. 

A benchmark should be simple so it can be understood and easily implemented. If
a benchmark is as complex as your intended application, then there would be little point
to using the benchmark (your application could be used to measure DBMS performance).
A benchmark must be a small and simple program which can be used as a yardstick to
evaluate a system under test (a DBMS and a computer system).

DBMS benchmarks are not perfect and can be abused by vendors. Gray cites two major
benchmark abuses: “Benchmark Wars” and “Benchmarketing” [14]. The “Benchmark
Wars” occur between DBMS vendors trying to maintain top performance on a specific
benchmark. If one vendor loses to another, the losing vendor reruns the benchmark with
better “gurus.” If the vendor succeeds in getting better results, the other vendor does
the same thing. This can continue to the point where modifications are being made to
the DBMS software specifically to make the benchmark run faster. “Benchmarketing” is
where a benchmark is modified (or a new benchmark is created) to allow a specific DBMS
product to perform extremely well.

2.3  DBMS Benchmark Standards Organizations

The Transaction Processing Performance Council (TPC) is the only existing standards
body for DBMS benchmarks. The TPC is a non-profit corporation founded in
August 1988. The mission of the TPC is “to define transaction processing and database
benchmarks and to disseminate objective, verifiable TPC performance data to the
industry” [39:1]. The TPC was created because of the lack of agreed-upon benchmarks for
measuring DBMS performance. The TPC performs the following services:

•  Defines Standard Benchmarks: The TPC has created three standard benchmarks to
date: TPC-A, TPC-B, and TPC-C (which will be examined in this chapter), and has
two more in the works: TPC-D (decision support domain) and TPC-E (enterprise
domain).

•  Full  Disclosure  of  Results:  All  companies  which  claim  a  performance  measure  on 
a TPC benchmark must submit a detailed report to the TPC (called a full disclosure
report). This report documents the benchmark’s compliance with the TPC
benchmark  standard. 

•  Quarterly  Report:  The  TPC  publishes  a  quarterly  report  which  contains  summaries 
of  all  the  benchmark  results  published  that  quarter. 

The TPC currently has 41 members, which include both DBMS software and computer
hardware vendors.

2.4  Benchmarks  for  OLTP  and  Relational  DBMSs 

The  following  four  benchmarks  for  OLTP  and  relational  DBMSs  are  examined: 

•  TPC Benchmark A (TPC-A)

•  TPC  Benchmark  B  (TPC-B) 

•  TPC  Benchmark  C  (TPC-C) 

•  Wisconsin  Benchmark 

The first three are the standard benchmarks defined by the TPC. The Wisconsin benchmark
is a benchmark for complex relational queries. We provide detailed explanations of
the similar TPC-A and TPC-B benchmarks. TPC-C and the Wisconsin benchmark are
not covered in as much detail due to their complexity.

2.4.1  TPC-A and TPC-B.  The TPC-A benchmark was developed in 1989 by the
Transaction Processing Performance Council; TPC-B was developed in 1990. TPC-A and
TPC-B use the same database and transaction profile, ACID¹ requirements, and costing
formula. The major difference between the two benchmarks is that TPC-B allows the use
of transaction generation processes to create transactions, while TPC-A requires the use
of terminal emulation to create transactions. The TPC-A benchmark is a simple OLTP
benchmark, while the TPC-B benchmark may be thought of as a database stress test.

The  specifications  for  TPC-A  and  TPC-B  state  they  “exercise  the  system  components 
necessary  to  perform  tasks  associated  with  that  class  of  on-line  transaction  processing 
(OLTP)  environments  emphasizing  update-intensive  database  services”  [14].  Both  TPC-A 
and  TPC-B  are  defined  in  terms  of  a  banking  application.  The  bank  has  one  or  more 
branches  and  each  branch  has  multiple  tellers  (each  with  a  terminal  to  the  database). 
All  the  bank  customers  have  an  account.  The  final  metric  from  the  TPC-A  and  TPC-B 
benchmarks is throughput as measured in transactions per second.

2.4.1.1  Benchmark Database.  The database consists of four tables (or files):
Account, Branch, Teller, and History. The relationships between these tables are shown in
Figure 1, an entity/relationship diagram for the database.

The Account table contains the following fields:

•  Account_ID (The key for the table)

•  Branch_ID (The branch where the account is held)

•  Account_Balance

The Branch table contains the following fields:

•  Branch_ID (The key for the table)

•  Branch_Balance

Figure 1.  TPC-A and TPC-B Entity/Relationship Diagram

¹Atomicity, Consistency, Isolation (or serializability), and Durability

The Teller table contains the following fields:

•  Teller_ID (The key for the table)

•  Branch_ID (The branch where the teller is located)

•  Teller_Balance

The History table contains the following fields:

•  Account_ID (Updated by transaction)

•  Teller_ID (Performed the transaction)

•  Branch_ID (Associated with teller)

•  Amount

•  Time_Stamp (Time of the transaction)

The benchmark specification requires that all branches must have the same number of
tellers and that all branches must have the same number of accounts. The number of rows
in each table is not a fixed value. It is scaled based upon the throughput rate for which
the test is configured.

    BEGIN TRANSACTION
        Update Account where Account_ID = Aid:
            Read Account_Balance from Account
            Set Account_Balance = Account_Balance + Delta
            Write Account_Balance to Account
        Write to History:
            Aid, Tid, Bid, Delta, Time_Stamp
        Update Teller where Teller_ID = Tid:
            Set Teller_Balance = Teller_Balance + Delta
            Write Teller_Balance to Teller
        Update Branch where Branch_ID = Bid:
            Set Branch_Balance = Branch_Balance + Delta
            Write Branch_Balance to Branch
    COMMIT TRANSACTION

Figure 2.  TPC-A and TPC-B Transaction Profile


2.4.1.2  Benchmark Operations.  Only one transaction is performed on the
benchmark database. The transaction profile is shown in Figure 2. Aid (Account_ID), Tid
(Teller_ID), and Bid (Branch_ID) are keys. For TPC-A, the Aid, Tid, Bid, and Delta are
read from a teller terminal and the transaction is processed. Then Aid, Tid, Bid, Delta,
and Account_Balance are written back to the terminal. For TPC-B, the Aid, Tid, Bid, and
Delta are provided by a driver, and only Account_Balance is returned to the driver after
the transaction has been processed. It is important to realize that the TPC-A benchmark
measures the time it takes messages to pass through the communication network to and
from the teller terminals while TPC-B does not.
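The transaction profile in Figure 2 can be sketched in Python against an in-memory SQLite database. This is an illustration only, not a compliant TPC-A or TPC-B implementation: the tables follow the field lists given earlier, while the use of SQLite, the column types, and the names `create_schema` and `run_transaction` are assumptions made for the example.

```python
import sqlite3
import time

def create_schema(conn):
    """Create the four benchmark tables (column types are assumed)."""
    conn.executescript("""
        CREATE TABLE branch  (branch_id INTEGER PRIMARY KEY,
                              branch_balance INTEGER);
        CREATE TABLE teller  (teller_id INTEGER PRIMARY KEY,
                              branch_id INTEGER, teller_balance INTEGER);
        CREATE TABLE account (account_id INTEGER PRIMARY KEY,
                              branch_id INTEGER, account_balance INTEGER);
        CREATE TABLE history (account_id INTEGER, teller_id INTEGER,
                              branch_id INTEGER, amount INTEGER,
                              time_stamp REAL);
    """)

def run_transaction(conn, aid, tid, bid, delta):
    """Apply one transaction and return the new account balance."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE account SET account_balance = account_balance + ? "
            "WHERE account_id = ?", (delta, aid))
        conn.execute(
            "INSERT INTO history VALUES (?, ?, ?, ?, ?)",
            (aid, tid, bid, delta, time.time()))
        conn.execute(
            "UPDATE teller SET teller_balance = teller_balance + ? "
            "WHERE teller_id = ?", (delta, tid))
        conn.execute(
            "UPDATE branch SET branch_balance = branch_balance + ? "
            "WHERE branch_id = ?", (delta, bid))
        row = conn.execute(
            "SELECT account_balance FROM account WHERE account_id = ?",
            (aid,)).fetchone()
    return row[0]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.execute("INSERT INTO branch VALUES (1, 0)")
    conn.execute("INSERT INTO teller VALUES (1, 1, 0)")
    conn.execute("INSERT INTO account VALUES (1, 1, 100)")
    conn.commit()
    print(run_transaction(conn, aid=1, tid=1, bid=1, delta=50))
```

In this sketch the caller plays the role of the TPC-B driver: it supplies Aid, Tid, Bid, and Delta and receives only the account balance back. A TPC-A style test would additionally route the request and reply through terminal emulation.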


2.4.1.3  Benchmark Measurements.  TPC-A and TPC-B provide two important
metrics: a tps and a K$/tps. The tps is a throughput measurement which stands for
“transactions per second.” To avoid confusion with other (older) similar benchmarks which
create a tps metric (i.e., DebitCredit and TP1 [1]), the TPC-A and TPC-B benchmarks
qualify their tps metrics. TPC-A uses “tpsA-Local” and “tpsA-Wide.” “tpsA-Local” states
the test was run using a local communications network, and “tpsA-Wide” states the test
was run using a wide area communications network. TPC-B uses “tpsB” for its results.

What is required to generate a valid tps rating for TPC-A and TPC-B is not obvious
and requires some explanation. The basic calculation is simple: to obtain a tps rating, the
number of transactions which started and completed during the test interval is divided by
the elapsed time of the test. But in order for the tps metric to be valid for the TPC-A
or TPC-B benchmark, several requirements must be met. They are as follows:

•  The  database  table  sizes  must  be  scaled  properly 

•  The  test  interval  must  be  at  least  15  minutes  (and  no  longer  than  an  hour) 

•  90%  of  the  transactions  must  have  less  than  a  2  second  response  time  (to  the  terminal 
for  TPC-A,  to  the  driver  for  TPC-B) 

•  Each  terminal  (for  TPC-A)  creates  a  new  transaction  (on  average)  every  10  seconds 

First,  the  database  table  sizes  must  be  scaled  to  the  throughput  goal  of  the  test. 
TPC-A  and  TPC-B  are  scaled  based  upon  DBMS  throughput.  If  a  benchmark  test  is 
trying  to  measure  a  throughput  of  10  tps,  then  the  database  size  must  be  scaled  for  that 
level  of  throughput.  A  tps  throughput  measurement  is  only  allowed  to  be  as  high  as  the 
database  table  size  allows.  For  each  tps  configured,  the  benchmark  specification  states 
that  the  database  tables  must  have  the  following  number  of  rows: 


    Table       Number of Rows
    Account     100,000 rows
    Teller      10 rows
    Branch      1 row

In  addition  to  the  required  table  sizes,  for  TPC-A  there  must  be  10  terminals  for 
each  tps  configured. 
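The scaling rule can be expressed as a small calculation. The sketch below assumes only the per-tps figures stated in the text (100,000 accounts, 10 tellers, 1 branch, and, for TPC-A, 10 terminals); the function name `minimum_configuration` is hypothetical.

```python
# Minimum rows required for each tps configured, as stated in the text.
ROWS_PER_TPS = {"Account": 100_000, "Teller": 10, "Branch": 1}
TERMINALS_PER_TPS = 10  # applies to TPC-A only

def minimum_configuration(configured_tps, tpc_a=True):
    """Return the minimum table sizes (and terminals) for a throughput goal."""
    config = {table: rows * configured_tps
              for table, rows in ROWS_PER_TPS.items()}
    if tpc_a:
        config["Terminals"] = TERMINALS_PER_TPS * configured_tps
    return config

if __name__ == "__main__":
    for item, count in minimum_configuration(10).items():
        print(f"{item}: {count:,}")
```

For a 10 tps TPC-A configuration this yields 1,000,000 accounts, 100 tellers, 10 branches, and 100 terminals.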

Second, the test must be run in a steady state for at least 15 minutes and
no longer than one hour, but the test system must have enough resources to run the test
for a total of 8 hours.

Third,  90%  of  all  transactions  during  the  test  must  have  a  response  time  of  under  2 
seconds. 


Example: Consider a TPC-A test system configured for 10 tps using a
local area network. To allow 10 tps the test database and computer system
must use a minimum of:

    Item        Number
    Account     1,000,000 rows
    Teller      100 rows
    Branch      10 rows
    Terminals   100

Suppose the test is run for 20 minutes and the 90th percentile response
time is 1.85 seconds (below the 2 second requirement). During the 20 minutes suppose
11,261 transactions are started and committed. The tps rating would be 9.38
tpsA-Local. Since this value is below 10 it is a valid tps rating for TPC-A.
But if 13,204 transactions started and committed during the test, the tps
rating of 11 tps would be invalid. This is because 11 tps is larger than the 10
tps throughput for which the test system was configured. □
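The arithmetic in this example can be checked with a short sketch. The function names and the way the requirements are combined into a single check are assumptions made for illustration; the thresholds (15 to 60 minute interval, 2 second response time for 90% of transactions, rating no higher than the configured tps) are those listed in the text.

```python
def tps_rating(committed_transactions, elapsed_seconds):
    """Transactions started and committed, divided by elapsed test time."""
    return committed_transactions / elapsed_seconds

def is_valid_rating(rating, configured_tps, pct90_response_seconds,
                    elapsed_seconds):
    """Check a rating against the requirements listed in the text."""
    within_scale = rating <= configured_tps
    fast_enough = pct90_response_seconds < 2.0
    interval_ok = 15 * 60 <= elapsed_seconds <= 60 * 60
    return within_scale and fast_enough and interval_ok

if __name__ == "__main__":
    elapsed = 20 * 60  # the 20 minute test interval
    for committed in (11_261, 13_204):
        rating = tps_rating(committed, elapsed)
        valid = is_valid_rating(rating, 10, 1.85, elapsed)
        print(f"{committed} transactions: {rating:.2f} tps, valid={valid}")
```

With 11,261 transactions the rating is about 9.38 tps and passes the check; with 13,204 transactions the rating is about 11 tps and fails because it exceeds the configured 10 tps.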


The second measurement from TPC-A and TPC-B is the K$/tps. This value is obtained
by dividing the cost of the system by the measured tps rating. The benchmark
standard is very specific about what items are to be included in the cost of the computer
system. It includes the cost of the computer hardware, terminals, communication lines,
database software, and maintenance.


Example: In the TPC-A test above (example 1), assume that the system
under test costs $140,000. The system had a throughput metric of 9.38 tpsA-Local.
The K$/tps metric would be 14.9 K$/tps (140 / 9.38). □
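The price/performance calculation is a single division once the cost is expressed in thousands of dollars. The helper below is a sketch with a hypothetical name; it does not model which items the standard requires in the system cost.

```python
def price_performance(system_cost_dollars, tps):
    """Return the K$/tps metric: cost in thousands of dollars per tps."""
    return (system_cost_dollars / 1000.0) / tps

if __name__ == "__main__":
    print(f"{price_performance(140_000, 9.38):.1f} K$/tps")
```

For the $140,000 system rated at 9.38 tpsA-Local, this evaluates to roughly 14.9 K$/tps.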

These two benchmarks are widely used by DBMS vendors today, and TPC-A and
TPC-B summary results are regularly published in computer industry literature.

A major strength of the TPC-A and TPC-B benchmarks is their simplicity. These
benchmarks produce very simple results (tps measurements). Because of these simple
results it is important to recognize the limitations of the benchmarks. TPC-A is a useful
yardstick for simple OLTP performance capabilities and TPC-B provides a simple DBMS
stress test, but because of the simple transaction used in both benchmarks, they are of
absolutely no value for measuring how well a DBMS will perform on complex queries.


2.4.2  TPC-C.  The TPC-C benchmark was developed in 1992 by the Transaction
Processing Performance Council. This benchmark was designed to simulate an OLTP
workload. It, like the TPC-A benchmark, is a useful yardstick for OLTP performance
capabilities. The TPC-C benchmark is much more complex than the TPC-A and TPC-B
benchmarks (it requires 111 pages for its specification, while TPC-A and TPC-B require 39
pages and 38 pages, respectively). TPC-C simulates a business where terminal operators
execute transactions against a database. The TPC-C benchmark is specified to exercise
the following components of an OLTP database system [38]:

•  The  simultaneous  execution  of  multiple  transaction  types  that  span  a  breadth  of 
complexity. 

•  On-line  and  deferred  transaction  execution  modes 

•  Multiple  on-line  terminal  sessions 

•  Moderate  system  and  application  execution  time 

•  Significant  disk  input/output 

•  Transaction  integrity  (ACID  properties) 

•  Non-uniform  distribution  of  data  access  through  primary  and  secondary  keys 

•  Databases  consisting  of  many  tables  with  a  wide  variety  of  sizes,  attributes,  and 
relationships 

•  Contention  on  data  access  and  update 

2.4.2.1  Benchmark Database.  The TPC-C benchmark database represents
a wholesale supplier with several sales districts. The supplier has warehouses which cover
a group of sales districts. Each sales district has a group of customers. For the TPC-C
benchmark the following rules are specified for the benchmark database:

•  Each  regional  warehouse  covers  10  districts 

•  Each  district  serves  3,000  customers 

•  All  warehouses  maintain  stocks  for  the  100,000  items  sold  by  the  company 

•  The  database  size  is  scaled  by  adding  more  warehouses  (all  the  other  cardinalities 
are  fixed) 

The  benchmark  database  size  is  scaled  based  upon  the  throughput  of  the  DBMS  (like 
TPC-A  and  TPC-B). 
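Because every cardinality except the number of warehouses is fixed, the database size follows directly from the warehouse count. The sketch below uses only the figures in the rules above; the function name is hypothetical, and the stock count assumes that each warehouse keeps one stock entry per item, which is my reading of the third rule.

```python
DISTRICTS_PER_WAREHOUSE = 10
CUSTOMERS_PER_DISTRICT = 3_000
ITEMS = 100_000

def tpcc_cardinalities(warehouses):
    """Derive the database cardinalities from the number of warehouses."""
    districts = warehouses * DISTRICTS_PER_WAREHOUSE
    return {
        "warehouses": warehouses,
        "districts": districts,
        "customers": districts * CUSTOMERS_PER_DISTRICT,
        "items": ITEMS,
        # Assumption: one stock entry per item in each warehouse.
        "stock_entries": warehouses * ITEMS,
    }

if __name__ == "__main__":
    for name, count in tpcc_cardinalities(2).items():
        print(f"{name}: {count:,}")
```

Two warehouses, for example, imply 20 districts and 60,000 customers, while the item catalog stays at 100,000 entries.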

2.4.2.2  Benchmark Operations.  The TPC-C benchmark operations are
based around the types of transactions which would be typical in an order-entry
environment. The following transactions are run on the TPC-C database:

1. New-Order Transaction: This transaction enters a complete order in a single database
transaction.

2. Payment Transaction: This transaction updates a customer’s balance and reflects the
payment on district and warehouse sales statistics.

3. Order-Status Transaction: This transaction queries the status of a customer’s last
order.

4. Delivery Transaction: This transaction processes 10 new orders (the orders are
delivered).

5. Stock-Level Transaction: This transaction determines the number of items that have
a stock level below a threshold level.

All of these transactions are executed during the TPC-C benchmark. They are executed at
the frequencies which would be expected in a real business.

2.4.2.3 Benchmark Measurements. The final metric from the TPC-C
benchmark  is  throughput  in  transactions  per  minute.  The  metric  is  called  “tpmC.”  As 
in  TPC-A  and  TPC-B,  the  reported  throughput  may  not  exceed  the  maximum  allowed  by 
the  database  size. 

The  complexity  of  the  TPC-C  benchmark  is  both  a  strength  and  a  weakness.  It 
is  a  strength  because  the  benchmark  is  more  realistic  (for  an  OLTP  application),  and  a 
weakness  because  it  makes  the  results  of  the  benchmark  more  difficult  to  interpret.  As 
with  all  the  TPC  benchmarks,  the  standardization  of  the  benchmark  is  a  strength.  The 
benchmark  leaves  little  flexibility  in  implementation  (so  it  is  less  likely  to  be  abused  by 
vendors). A weakness of this benchmark is that only a single throughput figure (tpmC) is
generally reported in summaries of results, although more detailed information is required
in the full disclosure report.

2.4.3 The Wisconsin Benchmark. The Wisconsin Benchmark was developed by
Bitton, DeWitt, and Turbyfill in 1983 [14]. This benchmark measures DBMS performance
on a variety of complex relational queries. Thirty-two queries are run on the benchmark
database and each query attempts to measure DBMS performance on one, or a group of, basic
relational operators (i.e., selection, projection, or join).
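
To make the three basic relational operators concrete, the following Python sketch applies each of them to small in-memory tables. It is an illustration under our own assumptions, not part of the Wisconsin Benchmark: tables are modeled as lists of dictionaries, the function names are hypothetical, and the join shown is a simple hash-based equi-join.

```python
def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, columns):
    """Projection: keep only the named columns (duplicates are not removed here)."""
    return [{col: row[col] for col in columns} for row in rows]

def join(left, right, key):
    """Equi-join: pair rows from both tables that agree on the key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]
```

A selection returns fewer rows from one table, a projection returns fewer columns from one table, and a join combines rows from two tables, which is why their costs can differ so widely within a single DBMS.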

2.4.3.1 Benchmark Database. The benchmark database consists of three
tables. The first table contains 1,000 tuples and is named ONEKTUP. The other two tables
contain 10,000 tuples each and are named TENKTUP1 and TENKTUP2. All three tables have
the same fields and are synthetically generated relations. DeWitt states that this choice
was made to make the database scalable and to “permit systematic benchmarking” [14:122].

2.4.3.2 Benchmark Operations. The benchmark measures performance on
the  following  types  of  queries: 

1. Selection Queries

2.  Join  Queries 

3.  Projection  Queries 

4.  Aggregate  Queries 

5.  Update  Queries 

There are a total of 32 queries in the benchmark specification.

2.4.3.3 Benchmark Measurements. For each of the 32 queries in the benchmark,
elapsed time is used as the performance metric. This is the wall clock time from
when the query was started until it completes.

This  benchmark  had  a  major  impact  on  commercial  DBMSs  when  it  was  created 
(1983).  As  DeWitt  states,  “by  pointing  out  the  performance  warts  of  each  system,  vendors 
were  forced  to  significantly  improve  their  systems  in  order  to  remain  competitive”  [14:120]. 
For example, at the time the benchmark was released, nested loops was the only join
method provided by the ORACLE and IDM 500 DBMSs [14]. DeWitt reports that “each
required over five hours to execute” one of the benchmark join queries [14:134].

The Wisconsin benchmark currently is being used to evaluate the performance of
database systems running on parallel processors [14].

The  major  strength  of  this  benchmark  is  its  focus  on  query  performance.  However, 
one  has  to  have  some  education  in  relational  database  theory  to  understand  the  results.  If 
a user doesn’t understand how a selection query is different from a join query, the results
from this benchmark will not be useful. But for domains, such as decision support, where
complex  queries  are  necessary,  this  benchmark  can  be  helpful  in  evaluating  the  performance 
of  a  DBMS. 

2.5  Benchmarks  for  Object-Oriented  DBMSs 

Though performance is important to most applications which could use object-oriented
DBMSs, Cattell maintains that little work has been done in the area [8]. Only the
following four benchmarks have been proposed for object-oriented DBMSs (none of which
are TPC standards):

•  Simple  Database  Operations  Benchmark 

• Object Operations Version 1 (OO1) Benchmark

• HyperModel Benchmark

• OO7

The Simple Database Operations benchmark was developed first. The HyperModel and
OO1 benchmarks are both based on the Simple Database Operations benchmark, but
HyperModel is a more complex benchmark than OO1. The OO7 benchmark is a new
benchmark created at the University of Wisconsin (the creators of the Wisconsin
Benchmark). Figure 3 shows the evolution of object-oriented DBMS benchmarks. Each of these
benchmarks is examined next.

2.5.1  Simple  Database  Operations  Benchmark.  This  benchmark  was  proposed 
in  1987  by  Rubenstein,  Kubicar,  and  Cattell  [35].  They  created  the  benchmark  because 
existing  relational  DBMS  benchmarks  were  poor  measures  for  the  applications  they  were 
working on. They needed a measure of “response time for simple queries” [35:387].

Figure 3. Evolution of Object-Oriented DBMS Benchmarks

As  an  example,  consider  drawing  a  polygon  on  a  computer  screen  where  the  lines 
which  make  up  the  polygon  are  stored  in  the  DBMS.  The  program  would  start  by  querying 
the database for the first line in the polygon. When that line was returned, it would be
drawn  on  the  screen.  This  process  would  be  repeated  for  each  line  in  the  polygon.  In  a 
complex  CAD  drawing  there  could  be  hundreds  of  thousands  of  lines.  This  is  the  type 
of  application  for  which  Rubenstein,  Kubicar,  and  Cattell  were  interested  in  providing 
a benchmark, but they used a more comprehensible database of documents and authors
rather  than  polygons  for  their  benchmark. 

2.5.1.1 Benchmark Database. The benchmark uses a database of document
and person records. Documents are related to people by a relationship called author. 5,000
documents, 20,000 persons, and 15,000 author relationships are created in the benchmark
database. To allow the benchmark to “scale up” to larger databases, the benchmark
proposes the same measurements also be run on a database ten times and one hundred
times larger [35:389].

2.5.1.2 Benchmark Operations. For each different size database, the performance
of the following seven different operations is measured:

1.  Name  Lookup:  Find  the  name  of  a  single  person. 

2. Range Lookup: Find the names of people with birth dates in a particular 10-day
period. 

3.  Group  Lookup:  Given  a  random  document,  find  all  authors  for  that  document. 

4.  Reference  Lookup:  Find  the  name  and  birth  date  for  a  single  author  of  a  random 
document.

5.  Record  Insert:  Create  a  new  author  record  and  add  it  to  the  database. 

6.  Sequential  Scan:  Retrieve,  one  at  a  time,  the  title  of  every  document  in  the  database. 

7.  Database  Open:  Perform  all  the  operations  necessary  to  make  the  DBMS  available 
to run an application program.
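
The database and three of the lookups listed above can be sketched in a few lines of Python. This is only an in-memory illustration under our own assumptions: the record layouts, the random assignment of birth dates and author relationships, and all function names are hypothetical, and only the record counts and the 10-day range come from the benchmark description.

```python
import random
from datetime import date, timedelta

def build_database(scale=1, seed=0):
    """Create persons, documents, and author relationships at the given scale."""
    rng = random.Random(seed)
    n_docs, n_persons, n_authors = 5_000 * scale, 20_000 * scale, 15_000 * scale
    persons = {
        pid: {"name": f"person{pid}",
              "birth": date(1950, 1, 1) + timedelta(days=rng.randrange(10_000))}
        for pid in range(n_persons)
    }
    documents = {did: {"title": f"doc{did}"} for did in range(n_docs)}
    # Each author relationship is a (document id, person id) pair.
    authors = [(rng.randrange(n_docs), rng.randrange(n_persons))
               for _ in range(n_authors)]
    return persons, documents, authors

def name_lookup(persons, person_id):
    """Name Lookup: find the name of a single person."""
    return persons[person_id]["name"]

def range_lookup(persons, start, days=10):
    """Range Lookup: names of people born in the period [start, start + days)."""
    end = start + timedelta(days=days)
    return [p["name"] for p in persons.values() if start <= p["birth"] < end]

def group_lookup(persons, authors, doc_id):
    """Group Lookup: names of all authors of the given document."""
    return [persons[pid]["name"] for did, pid in authors if did == doc_id]
```

Calling build_database(scale=10) or build_database(scale=100) would produce the two larger databases proposed for the scale-up measurements.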

2.5.1.3 Benchmark Measurements. For each of the benchmark operations
the performance measurement is the response time of the operation [35]. The response
time  is  the  elapsed  time  from  when  the  operation  is  started  until  it  completes. 

The  scaling  of  the  benchmark  database  in  this  benchmark  is  limited  to  only  three 
sizes. This is a weakness of this benchmark and most object-oriented DBMS benchmarks.
Most of these benchmarks intend to measure performance of the object-oriented DBMS
when the entire database can fit in memory, and then when it cannot fit in memory. The
main strength of this benchmark is that it is quite simple, but it has not turned out to be
very popular. It has also been overshadowed by the OO1 benchmark.

2.5.2 HyperModel Benchmark. This benchmark was proposed in 1990 by Berre
and Anderson [15:75-91]. The HyperModel benchmark is a very complex benchmark for
object-oriented DBMSs because it measures a large number of different operations. The
creators of the HyperModel benchmark concluded that the Simple Database Operations
benchmark did not measure enough database operations on a sufficiently complex database
to be representative of a wide variety of engineering applications [15].

2.5.2.1 Benchmark Database. The HyperModel benchmark uses a database
which represents hypertext. Hypertext consists of nodes and links. Nodes contain
information  such  as  text,  graphics,  or  sound.  Links  maintain  relationships  between  pieces 
of  information.  Berre  and  Anderson  state  that  “Hypertext  has  been  proposed  as  a  good 
model  for  use  in  Computer  Aided  Software  Engineering  (CASE)  because  it  is  possible  to 
store  software  and  documentation  as  hypertext”  [15:75]. 

2.5.2.2 Benchmark Operations. The following operations are measured in
the HyperModel benchmark:

1. Name Lookup: A lookup is performed on a hypertext node.

2.  Range  Lookup:  A  range  of  hypertext  nodes  is  looked  up. 

3.  Group  Lookup  and  Reference  Lookup:  Same  as  the  Simple  Database  Operations 
benchmark,  but  it  is  extended  to  one-to-many,  many-to-many,  and  many-to-many 
with  attribute  relationships  as  well. 

4.  Sequential  Scan:  Same  as  the  Simple  Database  Operations  benchmark.  The  entire 
database is retrieved.

5. Closure Traversal: Starting at a random hypertext node, find all the nodes transitively
reachable by a relationship. This is done for all the types of relationships in
the database.

6. Closure Operations: The same as closure traversal, but an operation is performed
at each node found during the traversal.

7. Editing: This operation changes the text found at a hypertext node, then changes it
back  to  its  previous  value.  The  operation  is  also  done  for  a  hypertext  node  with  a 
graphics  image  (picture)  stored  in  it. 

8.  Create  and  Delete:  This  operation  creates  a  node  and  then  deletes  it. 

9.  Open  and  Close:  Same  as  Simple  Database  Operations  benchmark,  but  database 
close time is also measured.
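
The closure traversal and closure operations in the list above can be sketched as a breadth-first search over one relationship. The Python fragment below is an illustration under our own assumptions: the relationship is modeled as a dictionary mapping each node to the nodes it links to, and the function and parameter names are hypothetical rather than taken from the HyperModel specification.

```python
from collections import deque

def closure_traversal(links, start, operation=None):
    """Return every node transitively reachable from start (start itself excluded).

    If an operation is supplied it is applied to each node as it is found,
    which corresponds to the closure operations variant.
    """
    seen = {start}
    found = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                if operation is not None:
                    operation(neighbor)
                queue.append(neighbor)
    return found
```

The seen set prevents a node from being visited twice, so the traversal terminates even when the hypertext links form cycles.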

2.5.2.3 Benchmark Measurements. The performance measurement is the
elapsed time of each benchmark operation. The HyperModel benchmark specifies that each
operation must be run 50 times and that the database must be shut down and restarted
before each new type of operation is started [15].

The realistic database used in the benchmark and the realism gained by the complexity
of the benchmark operations are this benchmark’s greatest strengths. The benchmark’s
complexity is also a weakness. The complexity makes the benchmark difficult to implement
and the results difficult to understand. This benchmark has not proved to be very popular
with  object-oriented  DBMS  vendors. 

2.5.3 Object Operations Version 1 (OO1). This benchmark was proposed in
1991 by Cattell and Skeen of Sun Microsystems [9]. It is simpler than the Simple Database
Operations benchmark (on which Cattell also worked) and is much simpler than the
HyperModel benchmark. Cattell and Skeen admit the benchmark is representative of a
smaller group of engineering applications than the HyperModel benchmark, but state they
were trying to create a “generic benchmark” [9:2-3]. This has some merit because the
OO1 benchmark is much simpler to implement than the HyperModel benchmark. The
OO1 benchmark is also known as the “Cattell” or “Sun” benchmark.

2.5.3.1 Benchmark Database. The database used for OO1 consists of connected
parts. A connection goes from one part object to another part object, and a single
part object may have several to and from connections. For each part, three connections
to other parts are created. These connections must ensure “locality of reference” by
connecting parts to parts which are closest to them (Part-id numbers which are numerically
close are defined to be close together) [9:4-5].
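
One way to generate connections with locality of reference is sketched below in Python. This is only an illustration: the function name and parameters are hypothetical, and the default split (90% of connections drawn from the nearest 1% of part-ids) is an assumption made for the example that should be checked against the benchmark specification [9] before use.

```python
import random

def build_parts(n_parts=20_000, fanout=3, locality=0.9, window=0.01, seed=0):
    """Map each part-id to a list of `fanout` connected part-ids.

    With probability `locality` a target is drawn from the part-ids numerically
    closest to the source part; otherwise it is drawn from the whole database.
    """
    rng = random.Random(seed)
    half = max(1, int(n_parts * window) // 2)  # half-width of the "close" range
    connections = {}
    for part in range(n_parts):
        targets = []
        for _ in range(fanout):
            if rng.random() < locality:
                low = max(0, part - half)
                high = min(n_parts - 1, part + half)
                targets.append(rng.randint(low, high))
            else:
                targets.append(rng.randrange(n_parts))
        connections[part] = targets
    return connections
```

With the defaults this yields 20,000 parts and 60,000 connections; passing n_parts=200_000 gives a database ten times larger.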

OO1 measures performance on two different size databases, called small and large.
The small database consists of 20,000 parts and 60,000 connections. The large database
is ten times larger than the small, hence having 200,000 parts and 600,000 connections.
The authors of OO1 intended that the small database would fit in the database management
system’s memory buffer (or working set), while the large database would not fit in
the memory buffer. They state that fitting in the working set is the “most important
distinction” between the small and large databases [9:5-6].

2.5.3.2 Benchmark Operations. The OO1 benchmark measures the performance
of the following three operations:

1. Lookup: Look up 1,000 (10,000 for the large database) parts in the database.

2. Traversal: Pick a single part, then find all parts connected to it (directly or indirectly),
up to seven levels deep.

3. Insert: Create 100 (1,000 for the large database) new parts with three connections
per part.

During  each  of  the  measurements  it  is  required  that  a  null  procedure  (representing 
some  work  done  in  an  application  program)  in  a  programming  language  be  called  at  each 
step. This requirement makes the benchmark “interactive” [9:7].

The  benchmark  forbids  any  of  the  operations  being  done  as  a  single  database  call, 
which  is  how  a  relational  DBMS  might  perform  the  operations. 
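
The traversal operation, including the null procedure call at each step, can be sketched as a depth-limited recursive walk. The Python fragment below is an illustration only: the function names are hypothetical, connections are modeled as a dictionary from part-id to a list of connected part-ids, and we assume the starting part counts as level zero.

```python
def null_procedure(part_id):
    """Stand-in for application work done at each step; intentionally empty."""
    return None

def traverse(connections, part, depth=7, visit=null_procedure):
    """Visit a part and everything connected to it, up to `depth` levels deep.

    Returns the number of parts visited; a part reached by more than one
    path is visited (and counted) once per path.
    """
    visit(part)
    visited = 1
    if depth > 0:
        for target in connections[part]:
            visited += traverse(connections, target, depth - 1, visit)
    return visited
```

Under the stated assumption, three connections per part and seven levels give 1 + 3 + 9 + ... + 3^7 = 3,280 visits per traversal, counting repeats. Each visit is a separate step in the application program, which is why the operation cannot be collapsed into a single database call.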

A unique feature of the OO1 benchmark is that it must be run remotely. This means
that the database must reside on one computer (server), and the benchmark application
(client) must reside on another. The two computers are connected via a network. Figure 4
shows this configuration. The authors state that a remote database configuration is “the
most realistic representation of engineering and office databases” [9:5].

The three OO1 benchmark measurements are run ten times. The results of the first
run are the “cold start results,” and the “asymptotic best times” on the remaining runs
are the “warm start results” [9:8].
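
The cold and warm start measurements can be sketched as a small timing harness. The Python fragment below is an illustration under our own assumptions: the function name is hypothetical, and the minimum time over the remaining runs is used as a simple stand-in for the “asymptotic best times,” which the specification may define more precisely.

```python
import time

def run_benchmark(operation, runs=10):
    """Time `operation` repeatedly and return (cold_seconds, warm_seconds)."""
    if runs < 2:
        raise ValueError("need one cold run and at least one warm run")
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        times.append(time.perf_counter() - start)
    cold = times[0]        # first run: caches start empty
    warm = min(times[1:])  # assumption: best later run approximates the asymptote
    return cold, warm
```

A large gap between the two figures suggests that the operation benefits from data cached by earlier runs.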

The OO1 benchmark has been very popular with object-oriented DBMS vendors
(it  has  become  a  de-facto  standard),  probably  in  large  part  due  to  its  simplicity.  This 
benchmark’s  simplicity  and  its  attempt  to  measure  the  effectiveness  of  client  caching  are 
its  strengths.  Its  poor  coverage  of  the  performance  of  a  large  number  of  the  functional 
elements  required  in  an  object-oriented  DBMS  is  its  major  weakness. 

Figure 4. Remote or Client-Server Database Configuration

2.5.4 OO7. The OO7 benchmark was proposed by Carey, DeWitt, and Naughton
in 1993. The work to develop this benchmark was done at the University of
Wisconsin-Madison. The OO7 benchmark is proposed as a “comprehensive” performance profile of
an object-oriented DBMS [7:1]. One of the interesting features of the OO7 benchmark
is that it evaluates the performance of the query processor of the object-oriented DBMS
(if it has one). The OO7 benchmark is more complex than the OO1 benchmark but it is
more focused than the HyperModel benchmark. The OO7 benchmark produces a set of
numbers as its output metrics rather than a single metric.

2.5.4.1 Benchmark Database. The benchmark database used for the OO7
benchmark is very complex. The database consists of a complex object hierarchy. The
levels of the hierarchy (from bottom to top) are listed below:

1.  Atomic  Parts 

2.  Composite  Parts  composed  of  atomic  parts  with  associated  documentation  objects 

3.  Base  Assemblies  composed  of  composite  parts 

4.  Complex  Assemblies  composed  of  base  assemblies 

5.  Design  Objects  composed  of  complex  assemblies 


6.  Modules  composed  of  design  objects  with  associated  documentation  objects 

Each object class in the database has several discrete attributes and connections between
classes are set up in the database. The OO7 benchmark scales the database to three
different sizes: small, medium, and large.

2.5.4.2 Benchmark Operations. The benchmark measures performance on
the  following  types  of  operations: 

1.  Traversals 

2.  Queries 

3.  Structural  Modification  Operations 

2.5.4.3 Benchmark Measurements. The performance measurement is the
elapsed  time  (or  response  time)  of  each  benchmark  operation. 

This benchmark is very new and it remains to be seen if it will become popular (specific
parts of the benchmark may become popular with vendors if their product performs
well on them). The benchmark is very complex and the results will probably have to be
accompanied with a full description of the benchmark operations (which was done in [7]).
The strength of this benchmark is that it is very comprehensive in its measurements. This
benchmark measures performance on a wide spectrum of object-oriented DBMS
functionality.

We have reviewed eight benchmarks which are used for the performance measurement
of DBMSs. For each benchmark the database, operations, and measurements used by the
benchmark have been discussed. Table 1 summarizes the benchmarks covered. DBMS benchmarks
can be a useful tool for evaluating commercial DBMSs, but they can also be abused. A
good understanding of DBMS benchmarks can help one to know when and when not to
use a benchmark.

Table 1. DBMS Benchmark Summary

Benchmark                    Domain              Simplicity  TPC Standard
TPC-A                        Simple OLTP         Simple      Yes
TPC-B                        DBMS Stress Test    Simple      Yes
TPC-C                        Complex OLTP        Complex     Yes
Wisconsin                    Query Performance   Complex     No
Simple Database Operations   Object Operations   Simple      No
HyperModel                   Object Operations   Complex     No
OO1                          Object Operations   Simple      No
OO7                          Object Operations   Complex     No

2.6  Computer  Simulation 

In the next two sections we examine computer simulation, including the use of object-oriented
DBMSs in computer simulation systems and simulation environments. The material
examined contributed to our design of the simulation benchmark.


2.7  Current  Simulation  Systems  Using  Object-Oriented  DBMSs 

In  this  section  we  examine  the  experiences  of  two  attempts  at  using  an  object-oriented 
DBMS for a simulation system: one using the C++ language with an object-oriented
DBMS, and one developed using the Ada language. Both of these systems previously used
a  relational  DBMS,  specifically  the  Oracle  relational  DBMS,  for  data  management. 

2.7.1  Visual  Intelligence  and  Electronic  Warfare  Simulation  Workbench.  Woyna, 
et  al.,  developed  a  simulation  system  which  used  the  Versant  object-oriented  DBMS  [40]. 
The  simulation  system  is  called  the  Visual  Intelligence  and  Electronic  Warfare  Simulation 
(VIEWS)  Workbench  software  system.  VIEWS  was  designed  to  enable  analysts  to  build 
detailed  intelligence  and  electronic  warfare  scenarios.  The  scenarios  created  by  VIEWS  are 
used to drive high-resolution intelligence and electronic warfare models. VIEWS had been
created  using  a  relational  DBMS  and  then  modified  to  use  the  Versant  object-oriented 
DBMS.  The  builders  of  VIEWS  cited  the  following  advantages  of  the  object-oriented 
DBMS  over  the  relational  DBMS  for  their  project: 

• Better Schema Support: Use of the relational DBMS required that the C++ structures
be “flattened” into relational tables. The object-oriented DBMS directly captured
the schema from the C++ application code. Also it was noted that the translation
from the relational tables to the internal C++ representation required
extensive source code which was unnecessary for the object-oriented DBMS.

• Better Application Language Interface: Use of the relational DBMS required developer
knowledge of two languages: C++ and SQL. The object-oriented DBMS did
not require developer knowledge of SQL.

• Better Application Performance: Reconstructing the complex objects in the C++
program from the normalized relational DBMS tables required complex joins between
many tables, which was expensive in terms of application performance. The
direct representation of C++ objects and the client cache available in the
object-oriented DBMS provided performance 10 to 100 times faster than
the relational DBMS [40].

• Additional Features: The object-oriented DBMS provided additional features such
as long transactions and versioning of objects which were not available from the
relational DBMS.
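
The flattening step described in the first advantage can be illustrated with a short sketch. The Python fragment below is hypothetical throughout: the Platform and Emitter classes and their attributes are invented for illustration and are not taken from VIEWS, and Python is used here in place of C++ only to keep the example brief.

```python
from dataclasses import dataclass, field

@dataclass
class Emitter:
    name: str
    frequency_mhz: float

@dataclass
class Platform:
    platform_id: int
    name: str
    emitters: list = field(default_factory=list)

def flatten(platforms):
    """Split nested objects into two relational-style tables of tuples."""
    platform_rows = [(p.platform_id, p.name) for p in platforms]
    emitter_rows = [(p.platform_id, e.name, e.frequency_mhz)
                    for p in platforms for e in p.emitters]
    return platform_rows, emitter_rows

def rebuild(platform_rows, emitter_rows):
    """Reassemble the objects by matching emitter rows to their platform key."""
    platforms = {pid: Platform(pid, name) for pid, name in platform_rows}
    for pid, name, frequency in emitter_rows:
        platforms[pid].emitters.append(Emitter(name, frequency))
    return list(platforms.values())
```

Both functions are translation code that exists only because the storage model differs from the program's object model; an object-oriented DBMS that stores the objects directly would need neither.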

It took approximately 10 person-days of effort to convert the 30,000 lines of C++ code
in the VIEWS system to the Versant object-oriented DBMS. No major porting problems
were noted. The conclusion of the authors is that the use of the object-oriented DBMS
in the VIEWS simulation system was a “far better approach” than using a relational
DBMS [40:501].

2.7.2  Saber  Wargame.  Mathias  extended  a  wargame  simulation  to  work  with  an 
object-oriented  DBMS  [25].  The  wargame,  called  Saber,  was  developed  at  the  Air  Force 
Institute  of  Technology  and  originally  interfaced  with  flat-files  and  the  Oracle  relational 
DBMS.  Mathias  replaced  the  flat-file  and  relational  DBMS  interfaces  with  an  interface  to 
the  Science  Applications  International  Corporation’s  (SAIC)  object-oriented  DBMS.  The 
SAIC  object-oriented  DBMS  was  developed  for  the  US  Air  Force  to  replace  an  existing 
COBOL-based  data  management  system. 

Saber uses the SAIC object-oriented DBMS as a data repository, but simulation
execution is not done persistently and no transaction model exists to allow concurrent
access to executing simulation data. Mathias found the performance of the SAIC
object-oriented DBMS to be slower than expected during large data transfers, and concluded that
it seemed “ill-advised” to blindly replace a relational DBMS or flat-file system with an
object-oriented DBMS [25].

2.8  Simulation  Environments 

Rooks  proposes  that  simulation  systems  are  composed  of  a  three-level  hierarchy  [34]. 
Rooks’  hierarchy  is  shown  in  Figure  5.  Models  are  created  in  a  simulation  language,  which 
is  a  part  of  a  simulation  system.  Rooks  proposes  that  although  most  simulation  systems 
are  created  around  a  simulation  language,  the  attention  should  be  around  the  simulation 
system.  He  compares  the  design  of  a  simulation  system  to  that  of  a  graphical  user  interface 
(GUI).  The  GUI  may  not  be  perfectly  suited  for  each  application  it  supports,  but  the 
imperfections are forgivable when compared to the amount of redundant effort saved in
the development of each application. Rooks proposes the following factors for evaluation of
a  good  simulation  system:  generality,  completeness,  suitability,  and  flexibility  [34]. 

Pidd  states  that  computer  simulation  systems  have  evolved  from  custom  simulation 
program creation (where the simulation program embodied the simulation logic and hard-coded
the data) towards the use of data-driven simulations [32]. Data-driven simulations
reduce  the  time  and  effort  required  to  simulate  a  system  because  they  do  not  require  a 
traditional  programming  effort.  Pidd’s  emphasis,  like  Rooks’,  is  on  reducing  the  amount 
of  redundant  effort  required  during  simulation  development.  Pidd  describes  two  different 
types  of  data-driven  simulation:  general  purpose  data-driven  simulators  and  domain  specific 
data-driven  simulators  [32].  General  purpose  data-driven  simulation  systems  provide  the 
following: 

•  A  pre-programmed  simulation  model 

•  A  model  suited  to  a  wide  range  of  applications 

•  No  traditional  programming  by  the  user 


•  User  provided  data  to  the  simulator 

•  Numerical,  logical,  or  textual  simulation  data 

Domain  specific  data-driven  simulators  attempt  to  provide  the  advantages  of  a  general 
purpose  data-driven  simulation  system,  but  are  more  specific  to  a  single  problem  domain. 
A domain specific data-driven simulator must contain simulation logic for all anticipated
instances  of  the  domain.  Pidd  proposes  that  all  data-driven  simulations  must  contain  the 
modules  shown  in  Figure  6  [32]. 
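
The idea of a data-driven simulation can be illustrated with a minimal sketch in which the simulation logic is fixed and everything specific to a run is supplied as data. The Python fragment below is our own example, not taken from Pidd: it models a single-server, first-come-first-served queue, and the function name and dictionary keys are hypothetical.

```python
def simulate_queue(data):
    """Run a pre-programmed single-server queue model on user-supplied data.

    `data` holds two equal-length lists: the arrival time of each job and
    the service time each job requires.
    """
    server_free_at = 0.0
    waits = []
    for arrival, service in zip(data["arrival_times"], data["service_times"]):
        start = max(arrival, server_free_at)  # wait if the server is busy
        waits.append(start - arrival)
        server_free_at = start + service
    return {
        "mean_wait": sum(waits) / len(waits),
        "finish_time": server_free_at,
    }
```

A user studies a different scenario by changing only the lists passed in, with no programming; supporting a different kind of system would require changing the pre-programmed model itself.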

2.8.1  Post-Processing  and  Simulation  Graphics.  Hurrion  reports  that  graphical 
displays  have  been  used  in  discrete-event  simulation  systems  since  the  mid-1970s  [31]. 
Hurrion  notes  that  two  different  approaches  to  graphical  display  output  have  developed: 

• Post-Processing or Playback Graphics: This technique animates the dynamics of a
simulation but does not allow user interaction while the simulation is running. This
technique is the most common technique used in the United States.

• Visual Interactive Simulation: This technique is similar to the previous technique,
but allows a user to interact with the running simulation. This technique is in
common practice in the UK.

Figure 6. Components of a Data-Driven Simulator

Hurrion  also  describes  two  different  types  of  simulation  graphics  output:  character 
graphics  and  high-resolution  bit-mapped  graphics  [31].  Character  graphics  use  repeated 
drawing and erasing of characters on a text screen to animate a simulation model. High-
resolution  bit-mapped  graphics  produce  superior  quality  animation  of  simulation  models, 
including  three-dimensional  animation.  Hurrion  notes  that  despite  the  visual  superiority 
of bit-mapped graphics, a substantial amount of time is required to create a quality
bit-mapped graphics animated display.

2.8.2  The  Joint  Modeling  and  Simulation  System.  The  Joint  Modeling  and 
Simulation  System  (J-MASS)  is  being  developed  by  the  Department  of  Defense  (DOD) 
as a standard architecture for modeling and simulation [6]. J-MASS provides a modeling
system designed to support the simulation requirements of the DOD. J-MASS consists of
two  major  parts:  the  simulation  support  environment  and  the  modeling  library. 

The  simulation  support  environment  (SSE)  is  a  work  environment  designed  to  assist 
a  modeler  in  five  functions: 

•  Develop  Model  Components:  This  portion  of  J-MASS  allows  a  model  developer  to 
create  individual  components  which  are  compliant  with  the  J-MASS  architecture. 
For  example,  a  model  developer  could  develop  an  engine  component  and  an  avionics 
component  for  an  aircraft  evaluation  model. 

•  Assemble  Model  Components:  This  portion  of  the  SSE  allows  the  composition  of 
model  components,  developed  previously  and  stored  in  the  modeling  library,  into 
new  components.  For  example,  an  aircraft  component  could  be  constructed  from  an 
engine  and  avionics  component  stored  in  the  modeling  library. 

•  Configure  Simulation  Scenarios:  This  portion  of  J-MASS  allows  the  developer  to 
place  a  model  in  a  simulation  scenario.  At  this  phase,  specific  locations  and  terrain 
may  be  specified  for  a  simulation. 

•  Execute  Simulations:  This  function  of  J-MASS  allows  execution  of  a  simulation.  The 
results of the simulation are journaled and made available for analysis during
post-processing. The specific results to be collected for post-processing are specified by
placing  “probe”  points  in  the  model  [6]. 

• Post-Processing: This function allows the J-MASS user to analyze the output of a
single  simulation  run  or  several  simulation  runs. 

The modeling library provides J-MASS users with a source of validated modeling
components.  Users  of  J-MASS  may  modify  the  existing  components  in  the  modeling 
library  by  changing  attributes  of  the  model  components.  The  modeling  library  provides  a 
model  reuse  repository  for  J-MASS  [21]. 

2.8.3  J-MASS  Data  Management.  The  current  version  of  the  J-MASS  system 
uses flat-files for data management. J-MASS is currently being expanded to use a DBMS,
but the DBMS use is limited to the model development, assembly, configuration, and
post-processing functions of the system. J-MASS does not currently plan to allow execution
of the simulation in a persistent environment, which would allow concurrent access to
executing simulation data. Current J-MASS simulations are developed, compiled, and executed as
Ada  programs. 

2.9  Summary 

In this chapter we have reviewed current literature about DBMS benchmarks and
computer simulation. In Chapter III we examine the functional capabilities of the three
commercial object-oriented DBMSs used for this research.

III. Object-Oriented DBMS Functional Comparison


In this chapter we examine the functional capabilities of the Itasca, Matisse, and
ObjectStore DBMSs. After an overview of the three DBMSs, we examine several
functional areas which are thought, in current literature, to be important for an object-oriented
DBMS to support [2, 4, 8, 37]. We examine transaction properties, locking and
concurrency control, security authorization, query capabilities, distributed database capabilities,
programming-language integration, and architecture. The topics we examine in this
chapter are far from exhaustive, but catalog the areas encountered during this research effort.

3.1  DBMS  Overview 

We selected the Itasca, Matisse, and ObjectStore DBMSs for our research because
they represent a cross-section of the commercial object-oriented DBMS industry. Table 2
shows the estimated 1993 market share of various companies in the object-oriented DBMS
industry [33]. The company names of the DBMSs we are using are set in italics.
ObjectStore leads the object-oriented DBMS industry with a 35% market share. Matisse holds a
solid 5% market share, while Itasca (which is a young company) holds a 2% market share.

Cattell in [8] describes the ObjectStore and Itasca DBMSs as object-oriented database
programming languages. This class of object-oriented DBMSs extends a programming
language with database capabilities. The ObjectStore DBMS extends the C++ programming
language, while the Itasca DBMS extends the Lisp programming language.

The Matisse DBMS is not examined by Cattell in [8], but would roughly fall into two
categories: object manager and database system generator. At the lowest level, Matisse
acts as an object manager. It provides no query language, and has a very basic data model.
Through the use of data model templates, Matisse acts as a database system generator.
A data model template tailors the database to a specific data model. Creation of these
templates is difficult, and certainly not automated, but provides some of the capabilities
of a database system generator.

Table 2. Estimated 1993 Market Share of Object-Oriented DBMS Companies [33]

Company          Annual Revenue    Market Share
                 (in millions)
Object Design    $20.4             35%
Servio           $7.3              13%
Objectivity      $6.6              11%
Versant          $5.9              10%
Ontos            $3.0              5%
Intellitic       $2.6              5%
BKS              $2.7              5%
O2 Technology    $1.9              3%
HP               $1.8              3%
Itasca           $1.2              2%
DEC              $1.1              2%
UniSQL           $1.1              2%
Other                              4%

3.2  Transaction  Properties 

The three commercial object-oriented DBMSs examined in this research support the
concept of a transaction. This section examines the support the three databases provide
for ACID transactions, long transactions, nested transactions, and non-blocking read-only
transactions.


3.2.1 ACID Requirements. Itasca, Matisse, and ObjectStore support transactions
which pass the ACID test: atomicity, consistency, isolation, and durability. The
ACID requirements for transactions are the accepted norm for all of today’s relational
DBMSs. The TPC benchmarks require that all transactions meet the ACID test.

3.2.2 Long Transactions. Support for long transactions differs among the three
databases. Itasca and ObjectStore provide support for long transactions via a version
system. Versioned objects are checked out into work areas. Matisse does not provide
support for long transactions.

3.2.2.1 Itasca. The Itasca DBMS supports long transactions through version
management. Objects are worked on in a work area called a private database. When a
user wants to work with a group of objects for a long period of time, they are checked out
of the Itasca shared database into the user’s private database. This checkout creates a new
version of all the checked-out objects in the user’s private database. In the current Itasca
implementation, the new version is a copy of all the objects, not a delta, so a substantial
amount of disk space may be required for this operation. After checkout, the objects
which remain in the shared database become read-only. When the user has finished
working, the objects in the private database are checked back into the shared database, and
are again available to other users.

3.2.2.2 ObjectStore. The ObjectStore DBMS also supports long transactions
through version management. To work on a group of objects for a long period of
time in ObjectStore, the objects are grouped into a configuration and checked out into a
workspace. The checkout of the configuration creates a private version of all the objects in
the configuration. When the user is finished with work on the configuration, it is returned,
or checked in, to the workspace from which it came.

ObjectStore allows more than one user to check out a version of a configuration. If
several different versions of the configuration have been created, then the versions must be
merged. ObjectStore provides support for version merging.

In an ObjectStore database, every object inside a configuration has a version history
which describes all the versions of an object which have been created. The past versions in
the version history can be read, but cannot be changed. The configuration groups
objects together as a unit for versioning. A configuration is also the unit of concurrency
control, so granularity for locking must be taken into account in the design of
configurations. When a new version of a configuration is created, a new version of all the objects in
the configuration is created. ObjectStore minimizes the storage overhead required for this
operation by only storing the differences, or deltas, between database pages which make
up the configuration [28].
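The storage trade-off between the two checkout strategies can be illustrated with a small sketch. It is not Itasca or ObjectStore code: the `Page` type and the functions `full_copy_cost` and `delta_cost` are hypothetical names introduced here. A full copy (as described above for Itasca) stores every page of the new version, while a page-level delta (in the spirit of what is described for ObjectStore) stores only the pages that differ from the parent version.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the contents of one database page.
using Page = std::string;

// Bytes stored when a new version is a complete copy of every page.
std::size_t full_copy_cost(const std::vector<Page>& new_version) {
    std::size_t bytes = 0;
    for (const Page& p : new_version) bytes += p.size();
    return bytes;
}

// Bytes stored when only pages that differ from the parent are kept.
// Pages beyond the end of the parent count as changed.
std::size_t delta_cost(const std::vector<Page>& parent,
                       const std::vector<Page>& new_version) {
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < new_version.size(); ++i) {
        const bool changed = i >= parent.size() || parent[i] != new_version[i];
        if (changed) bytes += new_version[i].size();
    }
    return bytes;
}
```

Under these assumptions, a version that modifies one page out of three costs one page under the delta scheme and three pages under the full-copy scheme.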

ObjectStore provides workspaces to allow organization of work, and to allow work on
a distinct version of a configuration. The workspaces in ObjectStore form a parent/child
hierarchy of arbitrary depth. The top node of the tree is called the global workspace. The
workspace hierarchy is created based upon the needs of the workgroup which it is designed
to support.


3.2.2.3 Matisse. The Matisse object-oriented services application programming
interface (API) does not provide support for long transactions [19]. Matisse provides
a version management system, but it is limited to creating object versions in a linear
manner. New versions of an object are always created from the most recent version and no
“forking” of the version tree is allowed. Matisse support informed us that this capability
will be available in version 2.3 of the Matisse DBMS.

3.2.3 Nested Transactions. Nested transactions are useful in an application where
a single session may contain several individual changes to data. Each individual change
may be committed or aborted and then the entire session may be committed or aborted.
An abort of the session would abort all the individual changes [8]. Only ObjectStore and
Itasca support nested transactions at this time.

The ObjectStore DBMS provides support for nested transactions. In ObjectStore,
transactions may be freely nested. The inner transactions depend upon the outer
transaction to determine if they are committed. The Itasca DBMS provides support for nested
transactions through database sessions. An Itasca database session is a sequence of
transactions. Itasca sessions may be created and destroyed in a “stack-like” manner [13:39]. The
Matisse object-oriented services API does not provide support for nested transactions.
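The commit and abort semantics described above can be sketched with a stack of pending write sets. This is only an illustration, assuming a simple in-memory key/value store; the class `NestedStore` and its member functions are hypothetical and do not correspond to the ObjectStore or Itasca interfaces.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory store with freely nestable transactions.
class NestedStore {
public:
    void begin() { pending_.emplace_back(); }

    // Writes are only legal inside a transaction.
    void put(const std::string& key, int value) {
        require_open();
        pending_.back()[key] = value;
    }

    // Committing an inner transaction hands its writes to the parent;
    // only the outermost commit makes them visible in the store.
    void commit() {
        require_open();
        std::map<std::string, int> writes = std::move(pending_.back());
        pending_.pop_back();
        std::map<std::string, int>& target =
            pending_.empty() ? committed_ : pending_.back();
        for (const auto& kv : writes) target[kv.first] = kv.second;
    }

    // Aborting discards this transaction's writes, including writes
    // inherited from inner transactions that had already committed.
    void abort() {
        require_open();
        pending_.pop_back();
    }

    bool contains(const std::string& key) const {
        return committed_.count(key) != 0;
    }

private:
    void require_open() const {
        if (pending_.empty()) throw std::logic_error("no open transaction");
    }

    std::map<std::string, int> committed_;
    std::vector<std::map<std::string, int>> pending_;
};
```

In this sketch an inner commit is provisional: its changes survive only if every enclosing transaction also commits, which matches the dependence of inner transactions on the outer transaction described above.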

3.2.4 Non-Blocking Read-Only Transactions. The Matisse DBMS provides a
unique feature due to its use of intrinsic versioning. The Matisse DBMS allows an
application to perform an “as of” transaction. The transaction is a non-blocking read-only
transaction which reads the state of the database at a specific logical database time. The
transaction reads the version of the object “as of” the logical database time specified, and
since (under intrinsic versioning) a new version of the object will be created if another
transaction changes the object, no locking is required.

ObjectStore and Itasca also allow read-only access to previous versions of objects,
but the application must have created those versions and is responsible for managing them.
New versions are not created every time an object is changed, as in Matisse.
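The idea behind an “as of” read can be sketched as follows. This is a single-threaded illustration under stated assumptions, not the Matisse API: the class `VersionedValue`, its member names, and the use of an integer logical time are all hypothetical. Every write adds a new timestamped version and never modifies an old one, so a reader asking for a past logical time always finds the same answer.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical versioned cell: every write creates a new version stamped
// with a logical database time; existing versions are never modified.
class VersionedValue {
public:
    // Assumes each write is given a fresh, increasing logical time.
    void write(std::uint64_t logical_time, int value) {
        versions_[logical_time] = value;
    }

    // Return the value "as of" the given logical time: the newest version
    // whose timestamp is not later than that time, or nullptr if the
    // object did not exist yet.
    const int* read_as_of(std::uint64_t logical_time) const {
        auto it = versions_.upper_bound(logical_time);
        if (it == versions_.begin()) return nullptr;
        return &std::prev(it)->second;
    }

private:
    std::map<std::uint64_t, int> versions_;  // logical time -> value
};
```

Because a later write only adds a version with a later timestamp, the result of `read_as_of` for an earlier time does not change, which is the property that lets such reads proceed without locks.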

3.3  Locking  and  Concurrency  Control 

Concurrency  control  in  Matisse  and  Itasca  is  at  the  object  level;  in  ObjectStore 
concurrency  is  at  the  database  page  level.  If  configurations  are  being  used  in  ObjectStore, 
then  concurrency  is  at  the  configuration  level.  The  differences  between  page  level  and 
object  level  concurrency  will  be  examined  further  in  Section  3.8. 

3.4 Security Authorization

Authorization  to  use  a  database  file  in  ObjectStore  and  Matisse  is  based  upon  UNIX 
file  system  security.  If  a  user  has  access  to  a  file  through  UNIX,  then  that  user  has  access 
to  it  through  ObjectStore  or  Matisse.  Essentially,  these  databases  rely  upon  the  operating 
system  to  provide  security. 

The Itasca database has a much more sophisticated security authorization system.
Every user of the Itasca database has an identifier, which is independent of the operating
system user identifier. The identifier is used to control access to objects in the Itasca
shared database and to provide access to various private databases. Every operation on
an object in Itasca is checked to ensure that the user has authorization. This security
checking incurs a large amount of overhead, so Itasca does not perform access checks on each
object when a user works inside a private database. In other words, if a user has access to
a private database, the user is granted access to all the objects in the private database.
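The difference between the two checking policies can be sketched as below. This is a hypothetical model, not the Itasca authorization interface: the class `Authorizer`, its member functions, and the representation of grants as sets of pairs are assumptions made for illustration.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical authorization table keyed by a DBMS-level user identifier
// that is independent of any operating system account.
class Authorizer {
public:
    void grant_object(const std::string& user, int object_id) {
        object_grants_.insert(std::make_pair(user, object_id));
    }

    void grant_private_db(const std::string& user, const std::string& db) {
        db_grants_.insert(std::make_pair(user, db));
    }

    // Shared database: every operation is checked against the object.
    bool may_access_shared(const std::string& user, int object_id) const {
        return object_grants_.count(std::make_pair(user, object_id)) != 0;
    }

    // Private database: one database-level check covers every object
    // inside it, so no per-object lookup is made.
    bool may_access_private(const std::string& user,
                            const std::string& db) const {
        return db_grants_.count(std::make_pair(user, db)) != 0;
    }

private:
    std::set<std::pair<std::string, int>> object_grants_;
    std::set<std::pair<std::string, std::string>> db_grants_;
};
```

The sketch shows why work inside a private database avoids the per-object overhead: the cost of the check no longer depends on how many objects are touched.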

3.5  Query  Capability 

All three of the DBMSs provide an “ad-hoc” query capability through a graphical
browser tool, but only ObjectStore and Itasca provide query language support from within
a database application. The Matisse DBMS does not provide a query language capability
in the object-oriented services API. Itasca and ObjectStore differ in the implementation
of their application query languages.

    ItascaSet obj_list;

    Vehicle::select(
        obj_list,
        QUERY_EXPRESSION, "( equal ManufacturerName \"Oldsmobile\" )",
        END_ARGS );

Figure 7. A Sample Itasca Query

3.5.1 Itasca. Itasca provides a query capability for DBMS applications. The
query language is limited to performing queries over instances of classes [3, 20]. The scope
of a query may be set by the user of the database to private, shared, or global. If the scope
of a query is private, only instances of the class in the current private database are examined
during the query. If the scope of the query is shared, then only instances of the class in
the shared database are examined during the query. If the scope of the query is global,
then all instances of the class in the current private database and the shared database are
considered. An example of an Itasca query in the C++ API is shown in Figure 7. The
query places all instances of the Vehicle class which have a value of “Oldsmobile” for their
ManufacturerName into the obj_list set. Notice that the query expression must be written
in the Lisp programming language, not the C++ language. This is a disadvantage of the
current Itasca C++ API because the programmer must understand the C++ and Lisp
programming languages along with the additional semantics of the Itasca DBMS.

Itasca  also  provides  a  query  optimizer  to  optimize  the  evaluation  of  application 
queries. 
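The three query scopes described in this section can be sketched in plain C++. This is an illustration only and does not use the Itasca API: the enumerations `Scope` and `Location`, the `Vehicle` structure, and the function `select_by_manufacturer` are hypothetical names.

```cpp
#include <string>
#include <vector>

enum class Scope { Private, Shared, Global };
enum class Location { PrivateDb, SharedDb };

// Hypothetical instance record: where it lives and one queried attribute.
struct Vehicle {
    std::string manufacturer_name;
    Location location;
};

// Decide whether an instance stored at `where` is examined under `scope`.
bool in_scope(Location where, Scope scope) {
    switch (scope) {
        case Scope::Private: return where == Location::PrivateDb;
        case Scope::Shared:  return where == Location::SharedDb;
        case Scope::Global:  return true;
    }
    return false;
}

// Collect the in-scope instances whose manufacturer name matches.
std::vector<Vehicle> select_by_manufacturer(const std::vector<Vehicle>& all,
                                            const std::string& name,
                                            Scope scope) {
    std::vector<Vehicle> result;
    for (const Vehicle& v : all) {
        if (in_scope(v.location, scope) && v.manufacturer_name == name)
            result.push_back(v);
    }
    return result;
}
```

With one matching vehicle in the private database and one in the shared database, the private and shared scopes each return one instance and the global scope returns both.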

3.5.2 ObjectStore. ObjectStore also provides a query language capability which
allows complex queries on collections. Collections in ObjectStore are groups of objects.
An example of a query in ObjectStore is shown in Figure 8 [28]. The query determines
all the public school students who are also teenagers. The query shown in Figure 8 uses
the ObjectStore DML. The ObjectStore DML extends the C++ language with database
constructs. The DML query language is a natural extension of the C++ programming
language. An ObjectStore query may also use the query method, which is defined for
every ObjectStore collection class.

    os_database *student_database;
    os_Set<student*> &public_school_students;

    os_Set<student*> &teenagers =
        public_school_students[: this->age >= 13 && this->age <= 19 :];

Figure 8. A Sample ObjectStore DML Query

ObjectStore  provides  a  query  optimizer  to  optimize  the  evaluation  of  application 
queries. 
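For readers unfamiliar with the DML bracket syntax, the selection performed by the sample query (students aged 13 through 19) can be expressed in standard C++ as a rough analogue. This sketch uses only the standard library, not ObjectStore; the `Student` structure and the function `teenagers` are hypothetical, and no persistence or query optimization is implied.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical student record with the one attribute the query tests.
struct Student {
    int age;
};

// Standard C++ analogue of a collection query that keeps the students
// whose age lies between 13 and 19 inclusive. The result holds pointers
// to the selected students, not copies of them.
std::vector<const Student*> teenagers(
        const std::vector<const Student*>& public_school_students) {
    std::vector<const Student*> result;
    std::copy_if(public_school_students.begin(), public_school_students.end(),
                 std::back_inserter(result),
                 [](const Student* s) { return s->age >= 13 && s->age <= 19; });
    return result;
}
```

The comparison suggests why the DML form reads as a natural extension of C++: the predicate is an ordinary C++ expression over the members of the element type.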

3.6  Distributed  Database  Capabilities 

All three DBMSs provide client/server type distributed capabilities, but only the
Itasca DBMS manages fully distributed databases. To set up a distributed database in
Itasca, several databases on different machines are initialized and then declared to be network
sites. Data can be migrated among network sites through the use of database administration
tools provided by Itasca. The Itasca DBMS’s distributed capabilities are very similar to
the capabilities provided by existing distributed relational DBMSs.

3.7 Programming-Language Integration

This functional criterion was identified by Cattell [8]. Cattell maintains that an
object-oriented DBMS should be closely integrated with a programming language to solve the
impedance mismatch problem which is common in relational DBMSs. If a close relationship
exists between the programming language and the database, a programmer will only have
to learn how to use one programming language.

3.7.1 Itasca. The Itasca DBMS has close ties with the Lisp programming
language, but Itasca has created an interface to the C++ programming language. For this
research we only investigated the C++ interface to Itasca, so we are unable to examine
how well the Itasca DBMS is integrated with the Lisp programming language. The main
design goal of Itasca when they developed the C++ API was to provide the full capabilities
of the existing Itasca database through the C++ programming language. Therefore, the
C++ API is closely tied to the C++ language where the C++ object model and the Itasca
object model are the same. Where they differ, however, the C++ API forces the Itasca
DBMS model upon the C++ programmer.

3.7.2 Matisse. The Matisse DBMS makes no attempt to provide close ties with a
programming language. The interface to Matisse, which is defined in the C programming
language, is a functional interface into the database’s object model. If a programming
language (such as C++) supports an object model, then the programmer must learn the
language’s object model and the Matisse DBMS’s object model. This separation ensures
that Matisse is not closely tied to any single programming language.

3.7.3 ObjectStore. The ObjectStore DBMS provides close integration with the
C++ programming language. ObjectStore defines a superset of the C++ programming
language, called the ObjectStore DML, which makes database functionality available to
the programmer. ObjectStore also provides a more functional interface to standard C++
(C++ without ObjectStore’s language extensions) and the C programming language.

3.8  Database  Architecture 

This  section  will  examine  some  of  the  available  information  on  the  architectures  of 
the  three  commercial  DBMSs  examined  for  this  research  and  the  major  differences  we 
observed  between  the  three  DBMSs  during  this  research. 

A major difference between ObjectStore, Itasca, and Matisse is the way data is passed
from the DBMS server to an application. ObjectStore is a page server rather than an object
server. The Itasca and Matisse DBMSs provide data to a client application as objects. For
example, if a single object is requested, that single object can be sent to the application.
ObjectStore transfers data to a client application in database pages. A database page in
ObjectStore could contain several objects, so if an object is transferred to an application,
several other objects will also be sent. The idea of the page server is to improve performance.
If an application works in a small area of its database, fewer transfers will be required
when sending database pages than when sending objects. The disadvantage of the page server is that
it is difficult to obtain fine-grained concurrency. The ObjectStore DBMS locks data elements
at the page level rather than the object level. The server has to lock all the objects on a
page, even when an application may only be working with a single object.
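The trade-off between page and object granularity can be made concrete with a small sketch. It assumes a deliberately simple layout in which objects are packed onto pages in identifier order; the `Layout` structure and the function names are hypothetical and do not describe how ObjectStore actually places objects.

```cpp
#include <cstddef>
#include <set>

// Hypothetical layout: objects are packed onto fixed-capacity pages in
// identifier order, so object i lives on page i / objects_per_page.
struct Layout {
    std::size_t objects_per_page;

    std::size_t page_of(std::size_t object_id) const {
        return object_id / objects_per_page;
    }
};

// Under object-level locking two writers conflict only on the same object.
bool conflicts_object_level(std::size_t a, std::size_t b) {
    return a == b;
}

// Under page-level locking they conflict whenever the objects share a page.
bool conflicts_page_level(const Layout& layout, std::size_t a, std::size_t b) {
    return layout.page_of(a) == layout.page_of(b);
}

// Transfers needed to fetch the given objects when the server ships whole
// pages: one transfer per distinct page touched.
std::size_t page_transfers(const Layout& layout,
                           const std::set<std::size_t>& object_ids) {
    std::set<std::size_t> pages;
    for (std::size_t id : object_ids) pages.insert(layout.page_of(id));
    return pages.size();
}
```

With eight objects per page, fetching objects 0, 1, 2, 3, and 9 takes two page transfers where an object server would make five, but writers of objects 3 and 5 now conflict even though they touch different objects.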

ObjectStore executes an ObjectStore server process, osserver, on any machine which
acts as a database server. Every machine which will be a client machine must be executing
a cache manager process, called cmgr. Only one cache manager is executed on a client
machine, even if multiple clients are executed. The requirement for the cache manager
process is a very annoying part of ObjectStore. Neither Matisse nor Itasca required any
software on the client machines, except the client application (although Matisse required
access to the license manager). The cache manager could also become a bottleneck if
several applications were executed on the same client. However, this is unlikely since a
single workstation usually only serves one user.

3.8.1 Itasca. Very little is discussed about the architecture of the Itasca DBMS in
the documentation provided with the product, but the architecture of the Orion DBMS is
described in [13, 22, 23]. The Itasca DBMS is the closest to a traditional relational DBMS
architecture of the three object-oriented DBMSs we examined. It provides a database
server which is responsible for executing all interactions with the DBMS. Some support
for caching of data is allowed in Itasca, but we encountered problems when using it in the
C++ API. Itasca support is creating better support for caching in a new version of the
C++ API, which should improve the performance of remote applications.

3.8.2 Matisse. The architecture of the Matisse database is composed of three
different levels built upon each other [19]:

• Micro-Model: The micro-model is the lowest level of the Matisse database. It
implements a very simple data model of objects and connections. The micro-model is
used to build templates.

• Templates: A template is a generic data model which describes the rules to follow
when designing a database schema. The only template which exists for Matisse is
an object-oriented template. This template is used by the object-oriented services
API. Matisse states that any data model template can be developed on top of the
micro-model. For example, a relational data model could be developed for Matisse.

• Schema: A schema is a user-defined structure which is used by an application.

Table 3. Summary of Object-Oriented DBMS Capabilities

Capability                             Itasca       Matisse            ObjectStore
ACID Transactions                      Yes          Yes                Yes
Long Transactions                      Yes          No                 Yes
Nested Transactions                    Yes          No                 Yes
Non-Blocking Read-Only Transactions    No           Yes                No
Locking Level                          Object       Object             Page or Configuration
Security                               Database     Operating System   Operating System
Application Query Language             Yes          No                 Yes
Distributed Database                   Yes          No                 No
Close Language Interface               Yes (Lisp)   No                 Yes (C++)
Server Type                            Object       Object             Page

3.8.3 ObjectStore. The ObjectStore architecture is designed to give a client
application very fast access to data. ObjectStore uses virtual memory mapping and client
caching to improve application performance. Virtual memory mapping allows persistent
objects to be accessed through normal memory locations. If a program tries to access an
object which has not been moved to memory, a fault occurs, and the object is brought
into memory. This process is transparent to the application programmer. The advantage
of the ObjectStore approach is that access to persistent data, once it has been mapped
into the application’s virtual memory, is as fast as normal memory access [28]. However,
ObjectStore’s database size is limited by the virtual address space of the machine, which
is 2^32 bytes on a Sun SPARC machine.

3.9 Summary

This chapter has examined the functional capabilities of, and differences between, the
Itasca, Matisse, and ObjectStore DBMSs. Several topics of database functionality
important to this research were investigated. A summary of the functional capabilities of Itasca,
Matisse, and ObjectStore is shown in Table 3. The next two chapters examine our work
with the OO1 benchmark and the simulation benchmark.

IV. OO1 Benchmark Analysis, Design, and Implementation


In this chapter we describe our analysis, design, and implementation of the OO1
benchmark for the Itasca, Matisse, and ObjectStore object-oriented DBMSs.
Object-oriented analysis and design were applied to the benchmark requirements to produce a valid
design. Our analysis and design used the methods presented by Coad and Yourdon [10,
11] and by Rumbaugh, et al. [36]. Implementation of the benchmark was done on the
three commercial object-oriented DBMSs based upon the final design. This chapter first
examines our object-oriented analysis and design of the OO1 benchmark and then examines
the issues faced when implementing the benchmark on the three object-oriented DBMSs.

4.1  Object-Oriented  Analysis 

Our analysis was based upon the information contained in the benchmark
specification [9, 14]. The specification defines requirements for the OO1 benchmark but does not
dictate a specific implementation. The goal of our analysis was to build an object-oriented
model of the system required by the benchmark specification. To do this, the steps of the
Coad/Yourdon object-oriented analysis (OOA) method presented in [10] were used. These
steps are listed below:

1.  Identify  classes  and  objects 

2.  Identify  structures 

3.  Identify  subjects 

4.  Define  attributes 

5.  Define  services 

Our final Coad/Yourdon OOA diagram is presented in Figure 9. A summary of
Coad/Yourdon notation appears in Appendix A. How we arrived at this diagram is
explained in detail in the following sections, which describe the steps taken in our analysis.

4.1.1 Identify Classes and Objects. The OO1 benchmark centers around a single
object: the part. Various actions are performed on parts and the wall clock time, or elapsed
time, required to perform them is measured. Therefore, the first class in our analysis model
was the part class. The benchmark requires that a group of parts be created in the database
(20,000 for the small database, and 200,000 for the large database). Each part connects
to three other parts and must know which parts connect to it. Each part contains, as
attributes, some information about itself. Each connection also contains some information
about the connection it represents. Figure 10 shows a group of five parts and some of their
connections. The connections are only shown for the part with an identifier value of 1, and
the part with an identifier value of 4 (part 1 and part 4). All of the other parts would have
three connections also, but these connections are not shown in Figure 10. Each part knows
to which parts it connects. For example, part 1 knows that it connects to part 1, part 2,
and part 2 (again). Also, part 4 knows that it connects to part 1, part 2, and part 3. The
benchmark specification does not mandate that the connections be kept in any particular
order. Each part also knows what parts connect to it. For example, part 1 knows that part
1 and part 4 connect to it. Hence, the arrows in Figure 10 can be followed in the reverse
direction, as well as in the forward direction. Notice that the number of connections from
a part is fixed at three, but the number of connections to a part may vary. For example,
part 1 connects to three parts, but has only two connections to it.
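The connectivity rules described above (exactly three outgoing connections per part, a variable number of incoming ones, and traversal in both directions) can be sketched in memory as follows. This is an illustration only, not the benchmark implementation: the class `PartGraph` and its member functions are hypothetical, and connection attributes are omitted.

```cpp
#include <array>
#include <map>
#include <vector>

// Hypothetical in-memory model of OO1 parts: exactly three outgoing
// connections per part, and a variable number of incoming ones.
class PartGraph {
public:
    // Record that part `from` connects to the three parts in `to`.
    // Assumes connect is called at most once for each `from` part.
    void connect(int from, const std::array<int, 3>& to) {
        outgoing_[from] = to;
        for (int target : to) incoming_[target].push_back(from);
    }

    // Forward traversal: the three parts that `part` connects to.
    const std::array<int, 3>& connects_to(int part) const {
        return outgoing_.at(part);
    }

    // Reverse traversal: the parts that connect to `part`. A part that
    // connects twice is listed twice.
    std::vector<int> connected_from(int part) const {
        auto it = incoming_.find(part);
        return it == incoming_.end() ? std::vector<int>() : it->second;
    }

private:
    std::map<int, std::array<int, 3>> outgoing_;
    std::map<int, std::vector<int>> incoming_;
};
```

Loading the two parts whose connections are given in the text, with `connect(1, std::array<int, 3>{{1, 2, 2}})` and `connect(4, std::array<int, 3>{{1, 2, 3}})`, leaves part 1 with three outgoing connections and two incoming ones (from part 1 and part 4).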

The second class we identified in our analysis was the connection class. The
connection class holds the attributes required for each connection. This class is really a
relationship between two parts which has attributes. In the Coad/Yourdon OOA notation, this
is represented as a class, but is modeled by Rumbaugh, et al. more directly. Figure 11
shows the representation of the part and connection classes in a Coad/Yourdon OOA
diagram and Figure 12 shows the same diagram represented as a Rumbaugh object model.
The Coad/Yourdon OOA diagram uses instance connections to model the structure shown
more directly in the Rumbaugh object model.

The last class we identified in our analysis of the OO1 benchmark was the oo1 class.
This class was needed for two reasons. First, it acts as a container for the parts and
indirectly the connections. Second, it encapsulates the OO1 benchmark measures.

Figure 11. Part and Connection Representation in a Coad/Yourdon OOA Diagram

Figure 12. Part and Connection Representation in a Rumbaugh Object Model

4.1.2 Identify Structures. Coad/Yourdon specify two types of structures which
should be identified during analysis: generalization-specialization and whole-part [10].
Generalization-specialization identifies “is a” relationships, commonly called inheritance,
between classes. Whole-part structures identify “has a”, or “is made up of”, relationships
between instances of classes. For the OO1 benchmark, we identified no
generalization-specialization structures, but set up a whole-part structure between the oo1 class and the
part class. In our model, all part instances are contained in a single instance of the oo1
class. This whole-part structure can be seen in Figure 9.

4.1.3 Identify Subjects. Subjects are used to break an analysis model into
different parts which form a logical group of classes [10]. Due to the small number of classes we
identified, we omitted the identification of subjects, but subjects are used in our design for
the OO1 benchmark.

4.1.4 Define Attributes. Attributes are data, sometimes called state information,
for which a separate copy is included in each object of a class [10]. Most of the attributes
we identified for the OO1 benchmark are dictated by the benchmark specification. The
specification describes the data which must be associated with each part and connection.
This simplified the definition of attributes for our analysis model. For the part class, the
following attributes were identified directly from the benchmark specification:

•  Id:  This  attribute  acts  as  a  unique  identifier  for  a  part,  which  is  defined  as  an  integer. 
Starting  from  a  value  of  1,  each  part  instance  is  assigned  a  consecutive  integer  for  its 
Id  V2due. 

• Type: This attribute is defined as a string. The string may take on values between
part-type0 and part-type9. The specific value for a part instance is randomly
selected.

• X: This attribute is defined as an integer between 0 and 99,999. The value represents
the x-axis value of a location. The specific value for a part instance is randomly
selected.

• Y: This attribute is defined as an integer between 0 and 99,999. The value represents
the y-axis value of a location. The specific value for a part instance is randomly
selected.

• Build: This attribute represents a date and time. The value for a specific part is
randomly selected from a 10-year range.

For the connection class, the following attributes were identified directly from the
benchmark specification:

•  Type:  This  attribute  is  defined  the  same  as  the  Type  attribute  of  the  part  class. 

•  Length:  This  attribute  is  defined  the  same  as  the  X  and  Y  attributes  of  the  part 
class.  The  value  represents  the  length  of  the  connection  between  the  two  parts. 

For the oo1 class, we identified only a single attribute:

•  NumParts:  This  integer  attribute  contains  the  number  of  parts  which  have  been 
created  in  the  database.  Its  value  will  be  20,000  for  the  small  database,  and  200,000 
for  the  large  database. 
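
The attributes listed above map naturally onto C++ declarations. The following is a minimal, non-persistent sketch of that mapping. The type names, member names, the helper functions partTypeName and makeParts, and the use of std::time_t for the Build attribute are all assumptions made for illustration; they are not taken from the benchmark source code.

```cpp
#include <cassert>
#include <ctime>
#include <string>
#include <vector>

// Illustrative, non-persistent stand-ins for the analysis-model classes.
// All identifiers here are hypothetical.

struct Connection {
    std::string type;  // "part-type0" through "part-type9"
    int length;        // same range as X and Y: 0 through 99,999
};

struct Part {
    int id;             // unique, consecutive, starting at 1
    std::string type;   // "part-type0" through "part-type9"
    int x;              // 0 through 99,999
    int y;              // 0 through 99,999
    std::time_t build;  // date and time within a 10-year range
};

struct Oo1 {
    int numParts;             // 20,000 (small) or 200,000 (large) after a load
    std::vector<Part> parts;  // whole-part structure: oo1 holds every part
};

// Formats the type string for an index in the range 0 to 9.
std::string partTypeName(int index) {
    assert(index >= 0 && index <= 9);
    return "part-type" + std::to_string(index);
}

// Creates `count` parts with consecutive Id values starting at 1.
// Random attribute values are omitted to keep the sketch deterministic.
Oo1 makeParts(int count) {
    Oo1 db;
    db.numParts = 0;
    for (int i = 1; i <= count; ++i) {
        Part p;
        p.id = i;
        p.type = partTypeName(0);
        p.x = 0;
        p.y = 0;
        p.build = 0;
        db.parts.push_back(p);
        ++db.numParts;
    }
    return db;
}
```

Calling makeParts(20000) would produce the part count of the small database, with NumParts tracking the number of parts created.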

4.1.5 Define Services. Services are the methods or functions which classes can
perform [10]. As was the case for defining attributes, the benchmark specification dictated
most of the required services. The services included the benchmark measures and the
creation and deletion of the benchmark database. For the oo1 class, the following services
were identified:

• Load: This service creates the benchmark database. For the small database, 20,000
parts and 60,000 connections are created. For the large database, 200,000 parts and
600,000 connections are created.

• Clear: This service removes all the persistent part and connection instances from the
database.

• Lookup Measure: This service looks up 1,000 randomly selected parts from the database.
For each part, a null procedure (a procedure which performs no purpose
except to exist) is called passing the X, Y, and Type attributes of the part¹. The
1,000 lookups are repeated ten times. The first time is reported as the cold time
for the measure and the asymptotic best time is reported as the warm time for the
measure (or the average of the second through the tenth time).

• Forward Traversal Measure: This service finds all the parts connected to a randomly
selected part, up to seven levels deep. 3,280 parts will be found in this measure. For
each part, a null procedure is called passing the X, Y, and Type attributes of the
part. The traversal is repeated ten times. The first time is reported as the cold time
for the measure and the asymptotic best time is reported as the warm time for the
measure (or the average of the second through the tenth time).

• Reverse Traversal Measure: This service is the same as the forward traversal, except
all the parts which connect to a randomly selected part are found. An indeterminable
number of parts will be found in this measure. The time for this measurement is
normalized for comparison with the forward traversal measure. The traversal is
repeated ten times. The first time is reported as the cold time for the measure and
the asymptotic best time is reported as the warm time for the measure (or the average
of the second through the tenth time).

• Insert Measure: This service will create 100 new parts in the database. Three new
connections will also be created for each new part. As each part is being created, a
null procedure is called to obtain values for the X and Y attributes of the part. The
100 inserts are repeated ten times. The first time is reported as the cold time for the
measure and the asymptotic best time is reported as the warm time for the measure
(or the average of the second through the tenth time).
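
The cold and warm times reported for each measure can be derived from the elapsed times of the ten iterations. The sketch below uses the "average of the second through the tenth time" reading of the warm time. The names MeasureResult and summarize are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <vector>

// Cold and warm times for one benchmark measure (hypothetical type).
struct MeasureResult {
    double cold;  // elapsed time of the first iteration
    double warm;  // mean elapsed time of the remaining iterations
};

// Summarizes the per-iteration elapsed times of a measure.
// The first entry is the cold time; the warm time is the average
// of the second through the last entry.
MeasureResult summarize(const std::vector<double>& elapsed) {
    if (elapsed.size() < 2) {
        throw std::invalid_argument("need at least two iterations");
    }
    double warmSum = std::accumulate(elapsed.begin() + 1, elapsed.end(), 0.0);
    MeasureResult result;
    result.cold = elapsed.front();
    result.warm = warmSum / static_cast<double>(elapsed.size() - 1);
    return result;
}
```

With ten iterations, the warm time is the mean of nine values, so a slow first iteration caused by an empty cache affects only the cold time.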

For  the  part  class,  the  following  services  were  identified: 

• Forward Traversal: This service calls the null procedure defined in the forward
traversal measure of the oo1 class, then calls the forward traversal service for each
part which is connected to it. Hence, this service is recursive.

¹When implementing the null procedure, care must be taken to ensure that the compiler does not
remove the call during optimization.

• Reverse Traversal: This service calls the null procedure defined in the reverse traversal
measure of the oo1 class, then calls the reverse traversal service for each part which
connects to it. Hence, this service is also recursive.

No services were identified for the connection class. We also note, via the shaded arrow
in Figure 9, that the oo1 class calls the services defined in the part class. Specifically, the
traversal measures are set up as recursive calls to the traversal services in the part class.
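
The recursive traversal service, and the figure of 3,280 parts for a forward traversal, can be illustrated with a small transient sketch. With three connections per part and seven levels, the number of visits is 1 + 3 + 9 + ... + 3^7 = 3,280, counting a part each time it is reached. The Part layout, the free function nullProcedure, the volatile counter used to keep the call from being optimized away, and the ring-shaped test graph are all assumptions made for this example, not the benchmark code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical transient part with forward connections only.
struct Part {
    int x = 0;
    int y = 0;
    std::string type;
    std::vector<Part*> to;  // parts this part connects to
};

// Writing to a volatile counter is one way to keep an optimizing
// compiler from removing the otherwise empty null procedure.
static volatile long g_nullCalls = 0;

void nullProcedure(int x, int y, const std::string& type) {
    (void)x;
    (void)y;
    (void)type;
    g_nullCalls = g_nullCalls + 1;
}

// Visits the part, then recurses into each connected part until
// `levels` hops have been followed. Returns the number of visits.
long forwardTraversal(const Part& part, int levels) {
    nullProcedure(part.x, part.y, part.type);
    long visited = 1;
    if (levels == 0) {
        return visited;
    }
    for (const Part* next : part.to) {
        visited += forwardTraversal(*next, levels - 1);
    }
    return visited;
}

// Connects every part to the next three parts, wrapping around.
// Requires more than three parts.
void connectRing(std::vector<Part>& parts) {
    const std::size_t n = parts.size();
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 1; k <= 3; ++k) {
            parts[i].to.push_back(&parts[(i + k) % n]);
        }
    }
}
```

A reverse traversal would follow the inverse relationship instead; because the number of parts connecting to a given part is not fixed, its visit count cannot be predicted in the same way.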

Our analysis of the OO1 benchmark is now complete. Due to the use of object-oriented
design and then implementation in an object-oriented programming language
using an object-oriented DBMS, no major changes to the model defined in our analysis are
necessary during design and implementation. Our design is described in the next section.

4.2 Object-Oriented Design

The object-oriented design for the OO1 benchmark took the analysis model developed
in the previous section and created a design for the benchmark. To do this, the steps of
the Coad/Yourdon object-oriented design (OOD) method were used [11]. The steps to
the OOD method are the same as those we used in our analysis, but attention focuses on
breaking up the analysis model into the following four components:

1.  Human  interaction  component 

2. Problem domain component

3. Task management component

4. Data management component

For the OO1 benchmark, we ignored the design of the task management component and
the data management component. The task management component was ignored because
the benchmark does not need to concurrently execute several tasks, and the data management
component was ignored because the object-oriented DBMS will provide all of the
benchmark’s data management. The next two sections will describe our design for the
human interaction component and the problem domain component.


4.2.1 Design of the Human Interaction Component. The OO1 benchmark needs
a minimal user interface. The benchmark is run from the UNIX command line. The
executable program for the benchmark is called bench and it accepts the following inputs:

• Object-Oriented DBMS Authorization: This information, which is different for each
object-oriented DBMS, provides the benchmark with the information necessary to
connect to the object-oriented DBMS and perform any authentication required by
the security features of the object-oriented DBMS.

• Benchmark Operation to Execute: This information tells the program which service
of the oo1 class to run. The choices include: load, clear, lookup, ftrav, rtrav,
and insert.

• Number of Parts in the Database: To avoid a database query to determine the number
of parts in the database, which could bias the cold results, the number of parts in
the database is provided to the benchmark program.

• Random Stream Number: The random number generator used for the benchmark
program contains 100 pre-defined streams of random numbers. This value selects the
stream used for the measure to be executed.

The  bench  program  provides  text-based  output  which  reports  the  benchmark  results.  For 
each  benchmark  measure  the  results  include  the  elapsed  time  for  each  of  the  ten  iterations 
and  the  cold  and  warm  times. 
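
The handling of the operation and random stream inputs might be sketched as follows. The enumeration, the function names, and the assumption that the 100 streams are numbered 0 through 99 are hypothetical; the text does not specify the order or syntax of the arguments to bench.

```cpp
#include <cassert>
#include <string>

// Hypothetical enumeration of the oo1 services selectable from the command line.
enum class Operation {
    Load,
    Clear,
    Lookup,
    ForwardTraversal,
    ReverseTraversal,
    Insert
};

// Maps an operation name to its enumerator.
// Returns false, leaving `out` untouched, if the name is not recognized.
bool parseOperation(const std::string& name, Operation& out) {
    if (name == "load")   { out = Operation::Load;             return true; }
    if (name == "clear")  { out = Operation::Clear;            return true; }
    if (name == "lookup") { out = Operation::Lookup;           return true; }
    if (name == "ftrav")  { out = Operation::ForwardTraversal; return true; }
    if (name == "rtrav")  { out = Operation::ReverseTraversal; return true; }
    if (name == "insert") { out = Operation::Insert;           return true; }
    return false;
}

// Checks a stream number against the 100 pre-defined streams.
// Numbering the streams from 0 is an assumption of this sketch.
bool validStream(int stream) {
    return stream >= 0 && stream < 100;
}
```

Rejecting an unknown operation or stream number before connecting to the DBMS keeps a mistyped command from affecting the timing of a cold run.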

4.2.2 Design of the Problem Domain Component. For the OO1 benchmark the
problem domain component is derived from our analysis model. The following changes
were made to our analysis model during design of the problem domain component:

• Variable Number of Connections From a Part: Our analysis model identified that
each part would connect to exactly three other parts. To allow for possible variations
in the number of parts a part can connect to, this was made a variable relationship.
This additional capability allows for variations to be made in the benchmark database.

[Figure 13. OOD Diagram for the OO1 Benchmark: Subject Layer. The diagram shows two subjects: 1. Benchmark Support and 2. OO1 Benchmark.]

• The oo1 Class Maintains a Collection of the Connection Instances: To facilitate
testing and to help eliminate object leaks, the oo1 class was designated to keep track
of all the connection instances as well as all the part instances.

•  Identification  of  Persistent  Classes:  The  part  and  connection  classes  were  identified 
as  persistent  classes.  That  is,  they  are  the  classes  which  were  stored  in  the  object- 
oriented  DBMS.  To  differentiate  persistent  classes  from  non-persistent  classes,  the 
persistent  classes  are  shaded  in  our  OOD  diagrams. 

• Addition of the Benchmark Support Library: The benchmark support library was
added as a separate subject. The benchmark support library consists of routines for
measuring elapsed time and generating random numbers. This library is documented
in Appendix F. The dice class is used by the OO1 benchmark to generate uniform
random numbers and the stopwatch class is used to measure the duration of the OO1
benchmark operations.

Figure 13 shows the subject layer of our OO1 benchmark design and Figure 14 shows our
final Coad/Yourdon OOD diagram.
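
The two roles of the benchmark support library, measuring elapsed time and generating uniform random numbers from a selectable stream, can be sketched with the standard C++ library. The interfaces below are assumptions made for illustration and are not the interfaces documented in Appendix F; in particular, seeding a generator with the stream number is only a stand-in for the 100 pre-defined streams.

```cpp
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical stopwatch measuring wall-clock time from the last start().
class Stopwatch {
private:
    typedef std::chrono::steady_clock Clock;

public:
    Stopwatch() : begin_(Clock::now()) {}

    void start() { begin_ = Clock::now(); }

    // Seconds elapsed since construction or the last call to start().
    double elapsedSeconds() const {
        return std::chrono::duration<double>(Clock::now() - begin_).count();
    }

private:
    Clock::time_point begin_;
};

// Hypothetical uniform random number source. Seeding with the stream
// number makes each stream repeatable from run to run.
class Dice {
public:
    explicit Dice(unsigned stream) : engine_(stream) {}

    // Returns a uniformly distributed integer in [low, high].
    int roll(int low, int high) {
        std::uniform_int_distribution<int> dist(low, high);
        return dist(engine_);
    }

private:
    std::mt19937 engine_;
};
```

A repeatable stream matters for a benchmark because it lets each DBMS be measured against the same sequence of randomly selected parts.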

Due to the use of the C++ programming language with an object-oriented DBMS,
our design changed little for each of the three implementations. The next section describes
our implementations of the OO1 benchmark on three commercial object-oriented DBMSs.

4.3 Benchmark Implementations

This section describes the implementation of our benchmark design on each of the
three commercial object-oriented DBMSs. We first examine the changes necessary to
our design to implement the benchmark on the database. Then we examine the problems
we encountered implementing and running the benchmark on the database.

4.3.1 Itasca Implementation. Our OO1 benchmark implementation for the Itasca
DBMS was created using the Itasca C++ API and written in the C++ programming
language. Four versions of the benchmark software were written for Itasca. We first
examine the benchmark implementation in Itasca and then cover the problems encountered
during implementation.

The Itasca DBMS requires that all persistent classes be subclasses of a class called
class. Our design was changed to reflect this requirement. A final Coad/Yourdon OOD
diagram for our Itasca implementation is shown in Figure 15. The Oo1AbstractConnection
class will be discussed during the coverage of our implementation problems with Itasca.

To create the persistent classes in Itasca the dynamic schema editor was used. This
program is the recommended method of creating classes in the Itasca database [20]. Once
the classes were created, the C++ definitions were dumped into a C++ header file (this
file was called schema.hh). The entire benchmark was implemented with a single C++
executable program, called bench.

Because the Itasca DBMS was implemented in the Lisp programming language, we
considered creating a Lisp version of the OO1 benchmark for Itasca. But Itasca support
told us that we would not see a significant performance difference due to the requirement
that the OO1 benchmark be run remotely. A Lisp version would have had better performance,
according to Itasca support, only if it was allowed to run exclusively in the native
(local) Lisp API.

Itasca support examined our benchmark code on several occasions and helped to
correct several problems with our program. The next section discusses the problems we
encountered implementing the OO1 benchmark in the Itasca DBMS.

The Itasca version of the OO1 benchmark took the most time to complete. A large
number of problems were encountered. The problems encountered with the Itasca implementation
fall into two groups: problems encountered with the Itasca DBMS and problems
with the C++ API. The reason we note the difference is that the C++ API is a new product
for Itasca and will no doubt change a good deal in the near future, while the Itasca
DBMS should be more mature. Most of the difficult and time-consuming problems encountered
with the Itasca implementation involved trouble with the C++ API and not
the Itasca DBMS. The following is a list of the problems we encountered with the Itasca
DBMS and C++ API:

• Slow Database Commit: The commit of data takes a significant amount of time in
the Itasca database. Itasca support indicated that they have been working on the
time the database commit takes, which they report is mostly taken up by object
hashing and writing to the disk. They report that they were able to improve the
performance by 400%, but at a cost of significant increases in memory use. We hope
that future versions of Itasca will solve this performance flaw.

• Unbounded Growth of the Database Log File: Due to the limited amount of disk space
available to perform our research we encountered a problem with the Itasca database
log file. The log file stores the transactions which are in progress. In theory, the log
file should shrink when database commits are performed, but the current version of
Itasca only garbage collects the log file during a restart of the database. To shrink
the size of the log file, the database must be shut down and restarted. The expanding
log file filled the entire disk on our test workstation on several occasions.

• Flat Name-Space for Database Objects: The name-space in the Itasca database used
for class names is flat across all users of the database. No two classes in the database
may have the same name. This can cause conflicts with existing names. For the OO1
benchmark, all the persistent class names were prefixed with the string Oo1 to avoid
this problem.

• Private Database Numbering: All private databases are assigned a number when
they are created. The user has no control over private database number assignment
but is responsible for keeping track of these numbers. In Itasca each user contains
an attribute which defines the private databases which have been created by that
user. When a user connects to Itasca (in a program or using one of the Itasca GUI
tools) Itasca places that user in the first private database that exists at the current
site and in the list of private databases created by that user. If there are no private
databases which have been created by the user, then the user is placed in a private
database to which the user has been granted access. If a user is not allowed in
any private database, then the user is placed in private database -1 (which means
no private database). The private database system was difficult to decipher when
we first started work with Itasca and often placed our benchmark program in the
wrong private database. There is currently no logical or symbolic representation for
the private database numbers and a user is responsible for keeping track of all the
private database numbers they own.

• No Circular Class References in the C++ API: The Itasca C++ API could not
represent circular references between classes. For example, if class A has a reference
to class B then class B is not allowed to have a reference to class A. This
limitation is only in the Itasca C++ API and is an unusual limitation. The C++
language does not have this limitation and the Itasca database does not have the
limitation. Itasca customer support told us that the problem would be cleared up
in the next release. To solve the problem, a dummy superclass was declared for the
Oo1Connection class named Oo1AbstractConnection. The Oo1Part class contained
references to the Oo1AbstractConnection class rather than the Oo1Connection class
and the Oo1Connection class contained the other side of our reference with a reference
to the Oo1Part class. To the application program the end result appears as if
there is a circular reference, but there is some overhead incurred because of the use
of the superclass.

• Lack of Non-Persistent Method Support in the C++ API: The Itasca C++ API does
not allow non-persistent methods or normal C++ class methods to be easily created.
The Itasca dynamic schema editor (DSE) is the preferred tool to create a database
schema for Itasca. Once a schema is created in the Itasca database, the DSE is used
to automatically generate a C++ header file. Since the DSE does not allow the
creation of C++ methods inside of classes, the generated C++ header file must be
edited to add the methods. Itasca support recommended using a C include file (a
“.h” file) to define the methods and including the file in the generated C++ header
file. This was done, and as Itasca support indicated, it saved a good deal of effort
when the C++ header file was regenerated.

A related problem was the inability to declare the attributes of a class private to
that class. This is an ability of
the C++ programming language which was not included in the Itasca C++ API.
Itasca support indicated that this was not supported because there is no concept of
the private class member in the Itasca database.

• Transient Memory Management: A large problem encountered with the Itasca C++
API was a misunderstanding on our part as to how the API handled transient memory
associated with persistent objects. When an object is read from the database, some
transient memory is allocated to hold that object. In Itasca, the management of that
memory is not considered the responsibility of the DBMS but of the application.
Though this is not documented in the C++ API manuals [20], the following rules
(which we deduced) were confirmed with Itasca support:

—  If  a  persistent  object  is  created  via  the  C++  new  operator,  then  the  application 
is  responsible  for  its  transient  memory. 

—  If  a  persistent  object  is  retrieved  via  an  Itasca  query,  then  the  application  is 
responsible  for  its  transient  memory. 

—  If  a  persistent  object  is  retrieved  via  an  Itasca  iterator,  then  the  application  is 
not  responsible  for  its  transient  memory. 

This was a very confusing portion of the Itasca C++ API and caused us to create
huge memory leaks in our benchmark program. The above rules were only deduced
after Itasca support removed the majority of the memory leaks from our program.
Itasca support informed us that transient memory management should have been in
the manual and that the documentation oversight will be corrected in the future.
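
The dummy superclass used to work around the circular reference restriction can be illustrated in plain C++. This sketch does not use the Itasca C++ API. The class names follow the text, but the member names, the helper function follow, and the use of dynamic_cast to recover the concrete connection are assumptions made for the example. Members are left public, in keeping with the lack of private attributes noted in the list above.

```cpp
#include <cassert>
#include <vector>

// Dummy superclass: lets Oo1Part refer to connections without
// naming Oo1Connection, so no class refers back to its referrer.
struct Oo1AbstractConnection {
    virtual ~Oo1AbstractConnection() {}
};

struct Oo1Part {
    explicit Oo1Part(int partId) : id(partId) {}

    int id;
    std::vector<Oo1AbstractConnection*> to;  // connections leaving this part
};

// The concrete connection holds the other side of the relationship.
struct Oo1Connection : public Oo1AbstractConnection {
    explicit Oo1Connection(Oo1Part* targetPart) : target(targetPart) {}

    Oo1Part* target;  // part at the far end of the connection
};

// Follows a connection held through the abstract type. The downcast
// stands in for the extra overhead the superclass introduces.
Oo1Part* follow(Oo1AbstractConnection* connection) {
    Oo1Connection* concrete = dynamic_cast<Oo1Connection*>(connection);
    return concrete ? concrete->target : nullptr;
}
```

The application sees what behaves like a part-to-connection-to-part cycle, while each individual class declaration refers only to a class that does not refer back to it.
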

4.3.2 Matisse Implementation. Our OO1 implementation for the Matisse DBMS
was written in the C++ programming language, but the object-oriented services API to
Matisse only uses the C programming language subset of C++. Three versions of the
benchmark were written for Matisse. We first examine the benchmark implementation in
Matisse and then cover the problems encountered during implementation.

Two executable programs were created for the Matisse version of the OO1 benchmark.
The first program was the bench program and the second was the oo1schema. The
bench program implemented the benchmark operations and the oo1schema program loaded
or removed the database schema from the Matisse DBMS.

The Matisse database makes no attempt to be transparent to the application program.
The bindings to the object-oriented services are a strictly functional API written
in the C programming language. The object-oriented nature of the Matisse database is
inside the database and is completely different from the object-oriented nature of the C++
programming language. Therefore, persistent objects in Matisse are not implemented as
C++ classes. The persistent objects are represented only in the database and communication
takes place through the Matisse object-oriented services API. The non-persistent
methods which are needed for the benchmark were implemented as C functions with calls
to the database.

Matisse support examined our implementation on several occasions and made several
suggestions for improvement. The various improvements made based upon their input will
be covered in the next section.

Most of the problems encountered during the development of the Matisse implementation
involved difficulty in understanding the operation and use of the Matisse DBMS.
The following is a list of the problems we encountered with the Matisse DBMS:

• Version Collection: The Matisse system performs intrinsic versioning of all objects.
If an object is changed in the database, a new version is created. This scheme
requires a large amount of disk space. To collect old versions and to compact the
current objects in the database, a program called mt-collect-versions must be used.
This program starts up a task in the database which runs concurrently with all other
database tasks. This program needs to be run rather frequently or a large amount
of disk space is unnecessarily consumed. Due to the limited amount of disk space
available for our test workstation we frequently ran out of database disk space (or
silo space in Matisse terms).

The collect versions operation occurs in asynchronous mode and the Matisse server
creates a dedicated thread to perform it. Thus, there was no easy method to measure
the time it takes to execute. The collect versions operation has two main goals:
collecting the old versions of the objects and compacting objects into the minimum
number of buckets. Matisse support informed us that most Matisse customers create
a crontab entry to run the collect versions operation in a periodic fashion.

Because the version collection program runs concurrently with other database tasks
and cannot be measured, we were unable to determine the exact impact which this
has on database performance. We ran several tests with different delays between
using the version collection and running the benchmark. It appears to be very important
to analyze the application for which the database is being used and set up a
reasonable schedule for the use of the version collection program. If any application
is running at the same time the version collection program is running, the other application
has priority. Therefore, database “down” time must be planned to run the
version collection program.

• Schema Creation: The first version of our OO1 benchmark implementation for Matisse
created the database schema during the load operation and removed the schema
during the clear operation. Customer support pointed out to us that this was causing
some additional overhead of which we were unaware. The Matisse object services
provide two different libraries which applications can link to: the data management
(DE) library and the data and schema management (DS) library. The DS library
is used for applications which modify the schema in the Matisse database. The DE
library is used for applications which work with objects in the database but do not
change the schema. There is some additional transaction overhead when the DS
library is used to ensure that schema changes are done properly. Matisse support
had run our original program to load the data, then relinked with the DE library to
run the benchmark measures and suggested that we do the same. Because the OO1
benchmark does not require schema evolution, we decided to separate the schema
creation from the benchmark program entirely. A new program, called oo1schema,
was created to load the benchmark schema into the Matisse database. This program
was linked with the DS library and the benchmark program, bench, was linked with
the DE library. The new program made the Matisse implementation closer to our
Itasca and ObjectStore implementations. For the Itasca implementation, the schema
is created using the dynamic schema editor, not by the benchmark program. For the
ObjectStore implementation, the schema is created by the OSCC compiler while the
program is being compiled and linked.

• Use of Client Memory Transport for Local Clients: The Matisse DBMS allows the use
of memory transport for local clients to the database, rather than the use of network
transport. Memory transport is designed to improve local client performance by
using shared memory and semaphores. Matisse support described to us the method
for enabling local client memory transport and we encountered no problems with
it. However, we feel that this setting should be the default for local clients, not an
esoteric parameter in the Matisse configuration file.

• Non-Blocking Read-Only Transactions: A feature of Matisse derived from the intrinsic
versioning is the ability to perform a non-blocking read-only transaction on any
old version of an object. Matisse support recommended using this feature for the
lookup and traversal measures of the OO1 benchmark since they are read only. We
did not use this ability in our OO1 benchmark implementation because all the other
implementations used full transactions, but do note that it is a powerful feature of
the Matisse DBMS.

• Object Identifier (OID) Usage: Matisse support informed us that OIDs in Matisse
are valid over the entire lifetime of an object. Therefore, it is efficient to read in all
the OIDs necessary for a program and then use them where necessary. We had been
looking up OIDs during each transaction.

• Benchmark Design: Matisse support was the only vendor support group which questioned
our design of the OO1 benchmark. The basic problem which they had was
with our use of two inverse relationships between the part and connection classes.
They felt that only one inverse relationship was necessary. They also noted that the
specific cardinality in our object model was not used in the implementations. In the
OO1 benchmark there are always three connections from a part, but the implementations
allowed this number to be larger or smaller. These questions from Matisse
support were resolved through the communication of our design to them.
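
The OID advice in the list above, reading identifiers once and reusing them rather than looking them up in every transaction, can be sketched as a small cache. This sketch does not call the Matisse API. The Oid type, the OidCache class, and the lookup callback that stands in for a database query are all hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical identifier type; a real OID representation is DBMS-specific.
typedef unsigned long Oid;

// Caches identifiers by name so that each one is fetched at most once.
// This is only safe because the identifiers stay valid for the
// lifetime of the objects they name.
class OidCache {
public:
    explicit OidCache(std::function<Oid(const std::string&)> lookup)
        : lookup_(lookup) {}

    // Returns the cached identifier, querying the database on first use.
    Oid get(const std::string& name) {
        std::map<std::string, Oid>::const_iterator it = cache_.find(name);
        if (it != cache_.end()) {
            return it->second;
        }
        Oid oid = lookup_(name);
        cache_[name] = oid;
        return oid;
    }

private:
    std::function<Oid(const std::string&)> lookup_;  // stand-in for a DBMS query
    std::map<std::string, Oid> cache_;
};
```

Filling the cache before the timed portion of a measure would keep identifier lookups out of the reported times.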

4.3.3 ObjectStore Implementation. Our OO1 implementation for the ObjectStore
DBMS was written in the C++ programming language using the ObjectStore DML
extensions to the language. Eight versions of the benchmark were written for ObjectStore.
No changes to the final design were necessary for the ObjectStore version of the OO1
benchmark.

Development of our benchmark with the ObjectStore DBMS was the closest to simply
developing the benchmark in the C++ programming language. ObjectStore appears to
the developer as a persistent C++ implementation with additional support for database
constructs. Persistent classes in ObjectStore are no different from transient objects in
C++, except for the method of their creation.

ObjectStore support was offered the opportunity to examine the benchmark program
we developed but they declined. They did provide a good deal of help with specific problems
encountered during development but did not provide much input on improving the
performance of the final program.

The  following  is  a  list  of  the  problems  we  encountered  with  the  ObjectStore  DBMS: 

• Persistent Object Leak: The greatest problem with the ObjectStore version of the
OO1 benchmark was a persistent object leak. This problem had occurred with all
the implementations, but it was discovered first in the ObjectStore implementation
of the benchmark. In a programming language, it is very important for a program
to give back dynamic memory to the operating system. If this is not done then
the program is said to have a memory leak. With an object-oriented DBMS it is
possible to have a memory leak into the persistent storage area. We have called this
a persistent object leak (since objects are lost in the database, not memory).

In our OO1 implementation, the persistent object leak occurred during the insert
measurement. The insert was creating 100 new part objects and 300 new connection
objects, but was only deleting the 100 part objects. Thus, every benchmark run on
the database was losing 300 connection objects in the depths of the database. The
problem was detected when the reverse traversal measurement started crashing the
benchmark program. The lost connections maintained a connection to the part to
which they were connected, so that lost connections could be traversed during the
reverse traversal measure. When a lost connection was traversed, the application
found that the connecting part did not exist and would crash. The C++ destructor
functions of the part and connection classes were modified to ensure that all the
inserted objects were deleted.

Persistent  object  leak  could  become  a  major  problem  with  object-oriented  DBMSs, 
and  we  noted  that  none  of  the  vendors  have  provided  tools  to  assist  the  application 
developer  to  find  lost  objects. 

• DBMS Tuning Parameters: We found the number of performance tuning parameters
available in the ObjectStore DBMS staggering. Despite this large number of parameters,
ObjectStore does not provide a guide to database performance. A large amount
of time can be spent tuning an ObjectStore database application for performance,
and this accounted for the eight different versions of the benchmark we created.
We saved a great deal of time by examining an ObjectStore OO7 benchmark implementation
developed at the University of Wisconsin-Madison and used many of the
same performance settings [7].

•  Index  Use  During  Queries:  Unlike  current  relational  databases,  an  index  must  exist 
for  it  to  be  used  for  a  query.  The  query  optimizer  in  ObjectStore  will  not  create  an 
index  if  you  have  not  defined  one.  This  also  seemed  to  be  the  case  in  Itasca.  Matisse 
provides  no  support  for  a  query  language  which  can  be  used  in  an  application. 
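
The destructor fix described under Persistent Object Leak can be illustrated with a minimal sketch in plain, transient C++. The class names, the live-object counters, and the ownership rule are assumptions made for illustration; this is not the benchmark source, and it uses no ObjectStore calls. The counters stand in for "objects still stored in the database".

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the benchmark's part and connection classes.
struct Part;

struct Connection {
    Part* from;
    Part* to;
    static int live;  // number of connection objects still in existence
    Connection(Part* f, Part* t);
    ~Connection();
};

struct Part {
    std::vector<Connection*> outgoing;  // connections leaving this part
    std::vector<Connection*> incoming;  // inverse relationship
    static int live;                    // number of part objects still in existence
    Part() { ++live; }
    ~Part();
};

int Connection::live = 0;
int Part::live = 0;

Connection::Connection(Part* f, Part* t) : from(f), to(t) {
    assert(f != nullptr && t != nullptr);
    ++live;
    from->outgoing.push_back(this);
    to->incoming.push_back(this);
}

Connection::~Connection() {
    // Unhook from both ends so no traversal can reach a deleted connection.
    auto unhook = [this](std::vector<Connection*>& v) {
        v.erase(std::remove(v.begin(), v.end(), this), v.end());
    };
    unhook(from->outgoing);
    unhook(to->incoming);
    --live;
}

Part::~Part() {
    // Deleting a part also deletes every connection that touches it.
    // Omitting this step is what leaves orphaned connections behind.
    while (!outgoing.empty()) delete outgoing.back();
    while (!incoming.empty()) delete incoming.back();
    --live;
}
```

With destructors written this way, deleting the inserted parts removes their connections as well, so a later reverse traversal cannot reach a connection whose part no longer exists.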

4.4 Verification of Benchmark Program Correctness

It was very important that the implementations of the OO1 benchmark be correct.
Two methods were used to verify the benchmark implementations: code inspections and
testing.

Code inspections were used for two purposes: error identification and implementation
consistency. First, the inspections identified errors in the benchmark implementations.
Second, the inspections identified variations in the source code of the three implementations.
Every attempt was made to keep the implementations consistent and the inspections
avoided unnecessary deviations in the implementations. A code style guide developed for
this research helped to simplify inspections of the source code. The style guide also provided
a consistent style for all the implementations. The code style guide is documented
in Appendix G.

A large amount of testing was done on the benchmark programs. Testing was done
on very small databases (10 parts with 30 connections) and used debug code built
into the source code of every benchmark implementation. A consistent style of debug
output was obtained through the use of a macro package developed by Microsoft [27]. This
package is documented in Appendix G.

Additional verification was provided by the DBMS vendor support groups, who examined
and executed the benchmark implementations at their sites. An exception to this
was ObjectStore support, which never examined the source code of our benchmark
implementation for their database.

4.5  Summary 

This chapter has covered the analysis, design, and implementation of the OO1 benchmark
developed for this research. We have examined, in detail, the problems encountered
when working with each of the three commercial object-oriented DBMSs. Chapter VI
examines the results obtained from our runs of the OO1 benchmark. The next chapter,
Chapter V, examines the simulation benchmark developed for this research. The simulation
benchmark investigates the ability of the three commercial object-oriented DBMSs to
support simulation systems.

V. Simulation Benchmark


In this chapter we examine the requirements, design, and implementation of the
simulation benchmark. The simulation benchmark is a new benchmark designed for the
computer simulation domain. The simulation benchmark sets up a complete simulation
system in an object-oriented DBMS. The purpose of the benchmark is to examine the
performance and functional capabilities of the three commercial object-oriented DBMSs
available at AFIT for use with computer simulation.

5.1  Benchmark  Description 

This section describes our simulation benchmark for object-oriented DBMSs. The
benchmark simulates aircraft searching for moving trucks over an area of land. When an
aircraft finds a truck, the location and the time when the truck was found are logged. The
simulation used in the benchmark is a stochastic discrete-event simulation model [24]. The
benchmark requires an entire simulation system to be built for each object-oriented DBMS
which is tested. The object-oriented DBMS must be used in all portions of the simulation
system for storage, including simulation execution. This benchmark attempts to re-draw
the traditional line between the simulation software and the database (traditionally implemented
as flat files). We first examine the requirements of the benchmark simulation system.
Then the benchmark measures are described; and, finally, we provide some justifications
for our benchmark.

5.1.1 Benchmark Requirements. The software required for our benchmark consists
of several parts: a model (of the aircraft and the trucks, and a map), a means to
configure a scenario, a simulation executive (to control simulation execution), a post-processing
module to view simulation results, and support libraries (which provide support for
both simulation and benchmarking). The benchmark quantitatively measures performance
in all the areas of the simulation system. The benchmark is also a functional benchmark
since we examine the qualitative capability of an object-oriented DBMS to support a general
simulation system.

Before examining specific requirements of the benchmark software, we enumerate
several general rules to which the benchmark must adhere. These rules apply to every
portion of the benchmark software.

1. Persistence of the Simulation Model: The simulation model used in the simulation
benchmark must be persistent. The model must be saved between one run of the
benchmark program and another.

2. No Flat-Files: The benchmark software may not use flat-files for data storage at
any point in time; the object-oriented DBMS must be used. This ensures we are
measuring the object-oriented DBMS's ability to store data, not the operating
system's.

3. The Object-Oriented DBMS Must Manage the Cache: The benchmark software must
allow the object-oriented DBMS to decide when data is brought in from the disk. The
benchmark program may not read all data from the object-oriented DBMS, run the
simulation, and then save all data to the DBMS. This ensures we are measuring the
object-oriented DBMS's ability to cache data, not the benchmark programmer's.

4. Multi-User Capability: The benchmark implementation must be able to provide
multi-user support. For example, two copies of the simulation system must be able
to execute at the same time inside the same database. Several of the benchmark
measures depend on this ability. The object-oriented DBMS's support for transactions,
concurrency, versioning, and locking should be used to provide the multi-user
capability, not custom code written by the benchmark programmer.

5. Use Common Graphics Code: All benchmark implementations must use the same
graphics code for the user interface. The goal of the benchmark is to measure the
performance of the object-oriented DBMS, not to examine the performance of a
graphics library. But it is important, from a functional point of view, to note the
ability of the object-oriented DBMS to work with the graphics library.

Our intent is that a single program be built to represent the simulation system. The
program must provide a windowing-system user interface to the simulation system. To
support the benchmark measures, the program must be instrumented with routines to
measure elapsed time and throughput.
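
As a rough illustration of the instrumentation requirement, the following is a minimal sketch of an elapsed-time routine and a throughput calculation. The names StopWatch and throughput are hypothetical, and the sketch uses the modern C++ std::chrono facilities rather than whatever timing calls the benchmark implementations actually used.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: measures elapsed wall clock time for one benchmark measure.
class StopWatch {
    using Clock = std::chrono::steady_clock;  // monotonic, unaffected by clock resets
    Clock::time_point start_;

public:
    StopWatch() : start_(Clock::now()) {}

    // Restart the measurement from the current instant.
    void reset() { start_ = Clock::now(); }

    // Seconds of wall clock time since construction or the last reset().
    double elapsed_seconds() const {
        return std::chrono::duration<double>(Clock::now() - start_).count();
    }
};

// Throughput expressed as operations completed per second of wall clock time.
inline double throughput(long operations, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(operations) / seconds;
}
```

A measure would construct a StopWatch immediately before the operation under test and read elapsed_seconds() immediately after it, so that prompts and other user interaction fall outside the timed region.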

5.1.1.1 Simulation System and Model Capabilities. This section describes
the requirements of the simulation environment for the simulation benchmark. All of the
requirements described in this section must be supported by the benchmark program. The
benchmark requirements are as follows:

•  Connect  and  Disconnect  to  an  Object-Oriented  DBMS:  The  benchmark  program 
must  provide  a  method  for  the  user  to  connect  to  an  object-oriented  DBMS.  The 
program  should  prompt  the  user  for  any  authentication  information  required  by  the 
DBMS  to  grant  access. 

• Benchmark Simulation Model: The benchmark supports a single model, that of
aircraft searching for trucks on a map. Since the benchmark is a discrete-event simulation,
the simulation state evolves over time with changes occurring instantaneously
at selected points in time [24]. The changes in simulation state are triggered by
events. The simulation contains two types of simulation objects: active and passive.
Active simulation objects execute events, while passive objects do not. The aircraft
and trucks are active simulation objects, and the map, which is a board of hexes, is
passive. We will first describe the state information, or attributes, required by each
object in our model, then describe the events required by the model.

The aircraft in the simulation each maintain the following information:

- Aircraft Name: This value is a string of 9 characters which uniquely identifies
an aircraft. The string is in the format “AIR-nnnnn”. The nnnnn represents
a unique number for the aircraft. Numbers are assigned sequentially to each
aircraft starting from a value of 0.

- Location: All aircraft know where they are located. The value is a hex identifier
specifying a single hex on the hex board.

- Home Base: This value is a string of 10 characters. The value is randomly
selected from the strings {“HOME-BASE1” . . . “HOME-BASE9”}.

- Miscellaneous Simulation Data: This value is a string of 50 bytes. The data
must be selected from at least ten different values and represents additional
information which would be required by a more complex simulation system
than is represented in this benchmark.

The trucks in the simulation each maintain the following information:

-  Truck  Name:  This  value  is  a  string  of  9  characters  which  uniquely  identifies 
a  truck.  The  string  is  in  the  format  “GND-nnnnn”.  The  nnnnn  represents  a 
unique  number  for  the  truck.  Numbers  are  assigned  sequentially  to  each  truck 
starting  from  a  value  of  0. 

-  Location:  All  trucks  know  where  they  are  located.  This  value  is  a  hex  identifier 
specifying  a  single  hex  on  the  hex  board. 

- Type: This value is a string of 11 characters. The value is randomly selected
from the strings {“TRUCK-TYPE1” . . . “TRUCK-TYPE5”}.

- Payload: This value is a string of 11 characters. The value is randomly selected
from the strings {“CARGO-TYPE1” . . . “CARGO-TYPE5”}.

-  Miscellaneous  Simulation  Data:  This  value  is  a  string  of  50  bytes.  The  data 
must  be  selected  from  at  least  ten  different  values  and  represents  additional 
information  which  would  be  required  by  a  more  complex  simulation  system 
than  is  represented  in  this  benchmark. 

There are three different types of events required by the simulation benchmark model.
They are the following:

-  Aircraft  Move:  This  event  moves  an  aircraft  into  a  randomly  selected  adjacent 
hex. 

- Search: This event causes an aircraft to search the hex in which it is located
for any trucks. If any trucks are found they are logged, so that they may be
reported in the summary report.

-  Truck  Move:  This  event  moves  a  truck  into  a  randomly  selected  adjacent  hex. 


Figure  16.  Event  Graph  (Queuing  Model)  for  the  Simulation  Benchmark 

The Aircraft Move and Search events are sent to aircraft and the Truck Move is
sent to trucks. An event-graph of the events is shown in Figure 16. The event-graph
notation is defined by Law in [24]. Each node of the graph represents an event type.
Each arc represents how an event may be scheduled by another event or by itself.
In the simulation benchmark, the Aircraft Move event schedules a Search event,
the Search event schedules an Aircraft Move event, and the Truck Move event
schedules another Truck Move event. The graph indicates that the Aircraft Move
and Truck Move events must be scheduled initially for each aircraft and truck. All
events are scheduled in the future based upon a random draw from an exponential
distribution. For the Aircraft Move event, a mean value of 60 seconds is used; for
the Search event, a mean value of 30 seconds is used; and for the Truck Move event,
a mean value of 600 seconds is used.

• Create, Store, and Copy Simulation Models: The benchmark program must be able
to create and manage several simulation models. For simplicity, only the search
model defined above is required, but multiple distinct instances of this model must
be allowed in the database (each instance having a separate simulation state). Each
model instance is created by the user and assigned a name. The user is also allowed
to make a copy of any existing model.

• Configure a Scenario: The benchmark program must be able to configure a scenario
within a simulation model. To create a scenario, the user enters the number of
aircraft, the number of trucks, and the map size desired for the scenario. The user can
also remove all the scenario elements from a model, effectively destroying the scenario.

• Support a Simulation Time Slice: The user is able to define a simulation time slice.
The time slice defines how many seconds of time are simulated in a single step of the
simulation, although many simulation events (or none) may be simulated during a
single time slice. The time slice defines the granularity of the simulator. The time
slice is defined as a time in seconds and must have a value of 1 or greater. The
time slice defines the fixed-increment time advance for the simulation benchmark
program [24].

The  time  slice  is  used  to  define  the  benchmark’s  transaction  model.  Each  time  slice 
is  executed  as  a  single  transaction  in  the  database.  This  definition  of  the  transaction 
model  allows  us  to  vary  the  amount  of  work  done  during  a  single  transaction. 

• Support Batch and Real-Time Simulation Execution: The benchmark program is
able to execute the simulation to a goal time, running as fast as the system will allow,
although the simulation must still observe the time slice setting. This type of execution
is called batch execution. The program must also support real-time execution
at a user-specified ratio with wall clock time. The time ratio is defined as the ratio of wall
clock time to simulation time. To simulate at the specified time ratio, the simulation
executes a single time slice, then delays for the remainder of the duration specified
by the time ratio. For example, if the time slice is set to 60 seconds, and the time
ratio is set to 1, then if the simulation requires 4 seconds to simulate the 60 second
(simulation time) time slice, the simulation would delay for 56 seconds before the
next time slice was executed. If the time ratio were set to 2 the delay would be 116
seconds. And if it were set to 0.5, then the delay would only be 26 seconds.

• Support Post-Processing of Simulation Data: The benchmark program provides two
types of post-processing. First, a graphical hex map displaying all the aircraft and
trucks in the current model is available at user demand (the map image is not required
to be stored in the DBMS). Second, a summary report of all the trucks found
by aircraft since the simulation started may be generated on demand. The summary
report is generated to a text file. This summary report contains the following
information for each truck found:

-  Report  Time:  The  simulation  time  when  the  report  was  generated. 

- Aircraft Name: The name of the aircraft which found the truck.

- Truck Id: The name of the truck found.

- Location Found: The location of the truck (the hex identifier) when it was
found.

- Time Found: The simulation time when the truck was located.
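
The scheduling rules of the event graph (Figure 16) can be sketched as a small pending-event queue. This is an illustration under assumptions, not the benchmark implementation: the class and member names are hypothetical, the model actions (moving, searching, logging) are left out, and each mean is applied to the delay before the event being scheduled, which is one reading of the text.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <random>
#include <vector>

enum class EventType { AircraftMove, Search, TruckMove };

struct Event {
    double time;     // simulation time in seconds
    EventType type;
    int object_id;   // index of the aircraft or truck receiving the event
};

// Orders the pending-event queue so the earliest event is on top.
struct LaterFirst {
    bool operator()(const Event& a, const Event& b) const { return a.time > b.time; }
};

class EventScheduler {
public:
    EventScheduler(int aircraft, int trucks, unsigned seed) : rng_(seed) {
        assert(aircraft >= 0 && trucks >= 0);
        // Initial events: one Aircraft Move per aircraft, one Truck Move per truck.
        for (int i = 0; i < aircraft; ++i) schedule(0.0, EventType::AircraftMove, i);
        for (int i = 0; i < trucks; ++i) schedule(0.0, EventType::TruckMove, i);
    }

    // Executes every event stamped at or before goal_time; returns how many ran.
    long run_until(double goal_time) {
        long executed = 0;
        while (!pending_.empty() && pending_.top().time <= goal_time) {
            Event e = pending_.top();
            pending_.pop();
            ++executed;
            // Only the scheduling arcs of the event graph are shown here.
            switch (e.type) {
                case EventType::AircraftMove:
                    schedule(e.time, EventType::Search, e.object_id);
                    break;
                case EventType::Search:
                    schedule(e.time, EventType::AircraftMove, e.object_id);
                    break;
                case EventType::TruckMove:
                    schedule(e.time, EventType::TruckMove, e.object_id);
                    break;
            }
        }
        return executed;
    }

    std::size_t pending() const { return pending_.size(); }

private:
    // Mean of the exponential delay before an event of the given type.
    static double mean_delay(EventType t) {
        switch (t) {
            case EventType::AircraftMove: return 60.0;
            case EventType::Search:       return 30.0;
            case EventType::TruckMove:    return 600.0;
        }
        return 0.0;
    }

    void schedule(double now, EventType type, int id) {
        std::exponential_distribution<double> delay(1.0 / mean_delay(type));
        pending_.push(Event{now + delay(rng_), type, id});
    }

    std::mt19937 rng_;
    std::priority_queue<Event, std::vector<Event>, LaterFirst> pending_;
};
```

Because every executed event schedules exactly one successor, the number of pending events in this sketch always equals the number of aircraft plus the number of trucks.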

This section has examined the requirements of the simulation benchmark. The next
section describes the operations which our benchmark measures.
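
The real-time delay rule described above (execute one time slice, then wait out the remainder of the wall clock time allotted by the time ratio) reduces to a short calculation. The function name slice_delay is hypothetical, and clamping the delay at zero when a slice overruns its allotment is an assumption, since the text does not say what happens in that case.

```cpp
#include <algorithm>
#include <cassert>

// Seconds to wait after a time slice so the run keeps the requested time ratio.
// time_ratio is wall clock time divided by simulation time, as defined in the text.
inline double slice_delay(double time_slice, double time_ratio, double elapsed) {
    assert(time_slice >= 1.0 && time_ratio > 0.0 && elapsed >= 0.0);
    const double budget = time_slice * time_ratio;  // wall clock seconds allotted
    return std::max(0.0, budget - elapsed);         // assumed: no delay on overrun
}
```

With a 60 second time slice that takes 4 seconds to simulate, this gives delays of 56, 116, and 26 seconds for time ratios of 1, 2, and 0.5, matching the example in the requirements.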

5.1.2 Benchmark Measures. This section examines the quantitative measurements
required by the simulation benchmark. These measurements are done for two different
database sizes. The first database contains 1,000 trucks, 500 aircraft, and a 50 x 50
hex board. This database is called the small database. The second database contains
10,000 trucks, 5,000 aircraft, and a 100 x 100 hex board. This database is called the large
database. The following benchmark measures are made on the simulation system:

• Model Creation: Measure the elapsed time required to create a new model instance.
This measurement does not include the time to create the scenario. This is a measure
of elapsed wall clock time.

• Scenario Creation: Measure the elapsed wall clock time required to create the scenario.

• Simulation Execution (Hour Run): Use the run-until function of the simulation
executive to run the simulation for 1 hour with the simulation time slice set to 60
seconds. This is a measure of elapsed wall clock time. This measure is also run with
time slices of 600, 1800, and 3600 seconds.

•  Simulation  Throughput:  Run  the  simulation  at  the  fastest  ratio  possible  with  the 
time  slice  set  to  60  seconds.  The  ratio  must  be  at  steady  state  for  at  least  10  minutes 
of  real  time.  Report  the  time  ratio  obtained.  90%  of  the  time  slices  must  maintain 
the  reported  ratio. 

• Version Creation: Measure the elapsed time to create a new version of a model which
has been run through time for 1 hour. The new version may be a complete copy of
the model or a new version created by the object-oriented DBMS.

• Map Creation: Measure the elapsed time for the creation of the map in the post-processing
portion of the simulation system. This measurement is taken for a paused
simulation, a running simulation, and a simulation running on a remote computer.

•  Report  Creation:  Measure  the  elapsed  time  for  the  creation  of  the  summary  report 
containing  a  list  of  the  trucks  found  by  aircraft  during  simulation  execution. 
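
The Simulation Throughput measure asks for a ratio that 90% of the time slices maintain. One possible reading, sketched below, is to take the per-slice wall clock times, sort them, and report the ratio achieved by the slice at the 90th percentile. The function name sustained_ratio and this percentile interpretation are assumptions; the text does not spell out the exact procedure.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// slice_wall_seconds[i] is the measured wall clock time for time slice i.
// Returns the smallest ratio (wall clock time / simulation time) that at
// least `fraction` of the slices met or beat.
inline double sustained_ratio(std::vector<double> slice_wall_seconds,
                              double time_slice, double fraction = 0.9) {
    assert(!slice_wall_seconds.empty() && time_slice >= 1.0);
    assert(fraction > 0.0 && fraction <= 1.0);
    std::sort(slice_wall_seconds.begin(), slice_wall_seconds.end());
    // Number of slices that must be covered; the small epsilon guards
    // against floating-point round-up in fraction * size.
    const double exact = fraction * static_cast<double>(slice_wall_seconds.size());
    std::size_t needed = static_cast<std::size_t>(std::ceil(exact - 1e-9));
    if (needed == 0) needed = 1;
    return slice_wall_seconds[needed - 1] / time_slice;
}
```

For ten slices with a 60 second time slice, nine of which take 9 seconds or less and one of which takes 100 seconds, this reading reports a ratio of 9/60 = 0.15 and ignores the single slow slice.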

This  section  has  described  the  quantitative  benchmark  measures.  In  addition  to 
the  quantitative  measures,  the  functional  ability  of  the  DBMS  in  which  the  benchmark  is 
implemented  to  support  the  simulation  should  be  noted.  The  next  section  describes  the 
justification  for  our  benchmark. 

5.1.3 Benchmark Justification. Several choices were made in developing a meaningful
benchmark for the simulation domain. This section examines our reasons for the
current benchmark and attempts to justify them as much as possible. To the best of our
knowledge, this is one of the first attempts to build a persistent simulation environment
using an object-oriented DBMS; the only other work in this area we have encountered is
described in [40].

We feel that the model implemented in this benchmark provides several of the elements
found in most discrete-event simulation systems. While it is true that the actions
of the simulation objects, the aircraft and trucks, are not very interesting, they do provide
a constant load on the simulation system when it is running. A problem with the use
of objects which move so regularly is that not all simulations exhibit this property. For
example, it is possible that a simulation may be very inactive for a long period of time,
then be required to process a large number of events in a short period of time. While this is
possible, we felt it was much more important to provide a predictable load (on average) so
that it was possible to understand how many events the simulation would execute during
a period of time. With this understanding it could be possible to use our results to predict
performance of another simulation based upon the expected number of events required to
be executed during the simulation.
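
The predictable load can be estimated from the event means given in the requirements. The sketch below is a long-run approximation that ignores start-up effects: each aircraft alternates an Aircraft Move (mean 60 seconds) and a Search (mean 30 seconds), so it executes about 2 events every 90 seconds, and each truck executes about one Truck Move every 600 seconds. The function name expected_events is hypothetical.

```cpp
#include <cassert>

// Approximate long-run number of events executed in `seconds` of simulation time.
inline double expected_events(int aircraft, int trucks, double seconds) {
    assert(aircraft >= 0 && trucks >= 0 && seconds >= 0.0);
    const double per_aircraft = 2.0 / (60.0 + 30.0);  // events per second per aircraft
    const double per_truck = 1.0 / 600.0;             // events per second per truck
    return (aircraft * per_aircraft + trucks * per_truck) * seconds;
}
```

Under this approximation, one simulated hour on the small database (500 aircraft, 1,000 trucks) executes about 500 x 80 + 1,000 x 6 = 46,000 events, and the large database about ten times as many.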

It is also important to note that the queue of events pending for the simulator is
going to be of the same magnitude as the number of active objects (aircraft and trucks) in
the simulation. This is due to the way events are scheduled (see Figure 16). Here we also
note that not all simulation systems exhibit this property.

The inclusion of a full graphical user interface (GUI) as a required part of the benchmark
could be controversial, but we feel that any modern simulation system is going to
require a graphical interface which is well integrated with the simulator. Therefore, it is
essential to measure the object-oriented DBMS's ability to interface with the GUI while
supporting the simulation system.

An earlier version of the benchmark included terrain data for each hex on the hex
board, but we decided to eliminate this. The terrain data complicated implementation
and did not provide enough benefit to justify keeping it. The same amount of data can be
created in the database by creating more aircraft and trucks in a scenario.

We have provided the map and the summary report to represent two types of information
which a simulation system would have to keep: current and cumulative. The map
reflects the current state of the simulation and requires that all the objects in the simulation
be traversed. The summary report requires that information which was available in
the past be maintained, via a logging method of some sort, and reported. Both types of
information are required in typical simulation systems.

Figure 17. Simulation Benchmark Design: Big Picture

5.2 Simulation Benchmark Design

For our design of the simulation benchmark, as in the design of the OO1 benchmark,
we used the methods described by Coad and Yourdon [11].

The big picture of our benchmark design is shown in Figure 17. The user of the
benchmark program interacts with a graphical user interface. The user interface interacts
with a database interface module which hides the specifics of the database from the user
interface implementation. The database interface interacts with the simulation objects
based upon commands from the user interface. All the simulation objects are required to
be persistent and are therefore stored in the object-oriented DBMS. The next sections
describe our design of the human interaction component (the user interface), the problem
domain component, and finally the database interface.
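
The role of the database interface module can be sketched as an abstract C++ class that the user interface calls without knowing which DBMS sits behind it. Everything in this sketch is hypothetical: the class names, the operation set, and the in-memory stand-in are illustrations of the layering, not the interface actually implemented for the benchmark.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical interface: the user interface calls only these operations,
// so DBMS-specific code stays behind one boundary.
class DatabaseInterface {
public:
    virtual ~DatabaseInterface() = default;
    virtual bool connect(const std::string& user, const std::string& password) = 0;
    virtual void disconnect() = 0;
    virtual bool create_model(const std::string& name) = 0;
    virtual bool copy_model(const std::string& from, const std::string& to) = 0;
    virtual std::vector<std::string> model_names() const = 0;
};

// In-memory stand-in used here only to exercise the interface.
class InMemoryDatabase : public DatabaseInterface {
public:
    bool connect(const std::string&, const std::string&) override {
        connected_ = true;
        return true;
    }
    void disconnect() override { connected_ = false; }

    bool create_model(const std::string& name) override {
        assert(connected_);
        // Model names must be unique; insert() reports whether the name was new.
        return models_.insert({name, 0}).second;
    }

    bool copy_model(const std::string& from, const std::string& to) override {
        assert(connected_);
        auto it = models_.find(from);
        if (it == models_.end() || models_.count(to) != 0) return false;
        const int state = it->second;  // copy the placeholder simulation state
        models_[to] = state;
        return true;
    }

    std::vector<std::string> model_names() const override {
        std::vector<std::string> names;
        for (const auto& entry : models_) names.push_back(entry.first);
        return names;
    }

private:
    bool connected_ = false;
    std::map<std::string, int> models_;  // name -> placeholder simulation state
};
```

A DBMS-specific class would implement the same operations, which lets the graphical code be shared unchanged across implementations, in line with the common graphics code rule.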

5.2.1 Design of the Human Interaction Component. Unlike the OO1 benchmark,
the simulation benchmark requires a large user interface component. Since we developed
our implementations on UNIX workstations, we decided to develop the benchmark user interface
with the OSF/Motif graphical user interface (GUI). Our design and implementation
of the user interface were intended to comply with the OSF/Motif Style Guide [29],
with additional ideas from Heller [17]. In this section we give an overview of our user interface design.

Figure 18. Simulation Benchmark User Interface Menu


The  main  menu  for  our  user  interface  to  the  simulation  benchmark  program  is  shown 
in  Figure  18.  The  main  menu  is  the  primary  interface  for  the  program  user.  The  user  is 
allowed  to  select  any  menu  item  which  is  not  dimmed  (menu  choices  are  only  allowed  when 
they  have  meaning  to  the  program).  The  following  menus  appear  on  the  menu  bar: 


•  File:  Contains  commands  to  interact  with  the  object-oriented  DBMS,  to  create,  copy, 
select,  and  close  simulation  models,  and  to  exit  the  program.  The  choices  on  this 
menu  are  described  below: 

- Connect: This choice allows a user to connect to the object-oriented DBMS.
The user is prompted for any authentication information required by the DBMS.
Then a connection is established.

- Disconnect: This choice terminates the program's connection to the object-oriented
DBMS.

- New Model: This choice creates a new model instance in the database. The
user is prompted for a name for the model. Then the model is created. The
name for the new model must not already be the name of an existing model.
After the model is created, it is set as the current model.

- Open Model: This choice presents the user with a list of the existing models in
the database. The user selects one of the choices and that model is set as the
current model.

- Close: This choice closes the current model.

- Save As: This choice creates a copy of the current model. The user is prompted
for a name for the copy, and then a copy is made. The current model remains
the same. If the user wishes to work with the new copy, the current model must
be closed. Then the copy must be opened with the File | Open Model command.

- Exit: This choice exits the program.

• Configure: Contains commands to build and clear a scenario for the current model.
The choices on this menu are described below:

- Create Scenario: This choice creates a scenario for the current model. The user
is prompted for the number of aircraft, the number of trucks, and the size of
the hex board desired. Then a scenario with the correct number of trucks
and aircraft is created. If a scenario already exists in the current model, it is
cleared before the new scenario is created. The simulation time for the current
model is set to time 0.

- Clear Scenario: This choice clears the current model's scenario.

•  Execute:  Contains  commands  to  control  the  execution  of  the  simulation.  The  choices 
on  this  menu  are  described  below: 

—  Set  Time  Slice:  This  choice  sets  the  value  of  the  time  slice  for  the  simulation. 
If  the  time  slice  is  not  set  by  the  user,  it  defaults  to  a  value  of  1. 

—  Set  Time  Ratio:  This  choice  sets  the  value  of  the  time  ratio  for  the  simulation. 
The  time  ratio  is  the  ratio  of  wall  clock  time  to  simulation  time.  If  the  time 
ratio  is  not  set  by  the  user,  it  defaults  to  a  value  of  1. 

- Run: This choice starts running the simulation at the set time ratio. A dialog
appears to the user reporting the current simulation time and statistics about
the execution of the simulation. The dialog is updated at the end of each
executed time slice. The dialog contains a push button which allows the user
to stop the simulation at any time.

AFIT SimBench REPORT (SIM TIME: 300 sec)

AIRCRAFT  | TRUCK     | LOCATION  | SIM TIME
AIR-00008 | GND-00022 | 0016-0020 | 61
AIR-00039 | GND-00043 | 0041-0039 | 79
AIR-00020 | GND-00004 | 0013-0032 | 119
AIR-00037 | GND-00018 | 0006-0017 | 160

Figure 19. Example of the Summary Report


—  Run  Until:  This  choice  runs  the  simulation,  as  fast  as  possible,  to  a  specific 
goal  time.  The  user  is  prompted  for  the  goal  time,  which  must  be  in  the  future, 
and  the  simulation  is  executed  until  that  goal  time.  This  option  also  displays  a 
dialog  which  is  updated  after  each  time  slice.  The  user  may  terminate  the  run 
until  operation  at  any  time  by  pressing  a  stop  button  which  is  displayed  in  the 
dialog. 

• Post Process: Contains commands to execute the two post-processing options of the
benchmark. The choices on this menu are described below:


—  View  Map:  This  choice  displays  a  hex  board  to  the  user  with  the  aircraft  and 
trucks  in  the  current  model  plotted  on  the  map.  The  map  may  be  left  up  and 
updated  by  a  user  using  an  update  push  button  or  it  may  be  dismissed  at  any 
time.  An  update  of  the  map  is  capable  of  responding  to  changes  in  the  scenario, 
including  a  change  in  the  hex  board  size. 

- Generate Report: This choice generates a summary report to the text file
SIM_REPORT. If the file already exists, it is overwritten. A sample of the
summary report is seen in Figure 19.

• Help: Contains help about the simulation benchmark. The simulation benchmark
does not provide a full help system due to the large amount of effort required to
implement such a system in the current version of OSF/Motif.

5.2.2 Design of the Problem Domain Component. This section describes our
design for the problem domain component of the simulation benchmark. The problem
domain component consists of all the simulation objects required by the simulation
benchmark. An OOD diagram of our design for the problem domain component of the
simulation benchmark is shown in Figure 20.

The model class is a container class which holds all the components of a model. Every
instance of the model class has a name which is stored in the name attribute. A model
contains a simulation environment, events, and logitems. The simulation environment
contains the active and passive simulation objects. The model’s events define the future
actions of the active objects in the simulation environment. The logitems record
information about what occurred during the simulation for later reporting. An instance
of the model class, with all its parts, defines a simulation state. The model stores its
current simulation time in the SimTime attribute and is able to advance simulation time
by dispatching events to the active objects in the environment. Simulation time in a
model is advanced using the RunUntilSimGoalTime service. This service executes a
sequence of events in time order until the goal time is reached. The concepts of time
slice and time ratio are not understood by the model class; it is only able to advance to
a specific future time. The Schedule service of the model class allows active objects in
the model’s environment to schedule future events for the model. The
AddEnvironmentMember service allows new objects to be added to the model’s environment.

Figure 20. Simulation Benchmark Problem Domain OOD Diagram

All objects which are part of a model’s environment are subclassed from the
environment class. The environment class is a virtual class. That is, no instances of the
class are allowed to be created. The main purpose of the environment class is to force
every subclass to implement two services: a clone service and a query service. The clone
service creates an exact copy of the object. This service is required to allow distinct
copies of models to be created. The query service provides a means for other objects to
obtain information about a simulation object. Every subclass of the environment class
must answer two queries, but most support more. The first required query is a query for
the class name and the second is a query for the object instance’s name. Simulation
objects which are subclassed directly from the environment class are passive objects. For
an active object to be created, it must be subclassed from the player class.

The player class is a virtual class which defines the additional services required by
active simulation objects. The player class requires all its subclasses to define an Execute
service. The Execute service is called for an object when an event for that object is
dispatched to it. Each event in the model contains a reference to the player object for
which it is designated; this reference is labeled ToPlayer in Figure 20.

The HexBoard class is the only non-active simulation object in the benchmark model.
This class contains three attributes: name, width, and height. The class allows queries
on its width and height attributes in addition to the two basic queries required by the
environment class.

The Aircraft and Truck classes are active simulation objects in the benchmark model.
These classes are subclassed from the player class. They contain the attributes
defined for them in the specification and allow queries on any of their attributes. The
Execute service of the aircraft class will accept move and search events, while the Execute
service of the truck class will accept only move events.
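
The class hierarchy described above can be sketched as follows. This is a minimal
illustration in Python, not the C++ and ObjectStore code used for the benchmark. The
method names, the query keys ("class", "name", "width", "height"), and the moves counter
on the truck are assumptions made for the example; the attributes that the specification
defines for aircraft and trucks are not reproduced here.

```python
from abc import ABC, abstractmethod
import copy


class Environment(ABC):
    """Abstract base for every object in a model's environment."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def clone(self):
        """Return an exact, independent copy of this object."""

    @abstractmethod
    def query(self, what):
        """Answer the two queries that every subclass must support."""
        if what == "class":
            return type(self).__name__
        if what == "name":
            return self.name
        raise KeyError(what)


class Player(Environment):
    """Abstract base for active objects, which must also handle events."""

    @abstractmethod
    def execute(self, message):
        """React to the message carried by a dispatched event."""


class HexBoard(Environment):
    """Passive object: derives directly from Environment."""

    def __init__(self, name, width, height):
        super().__init__(name)
        self.width = width
        self.height = height

    def clone(self):
        return copy.deepcopy(self)

    def query(self, what):
        if what in ("width", "height"):
            return getattr(self, what)
        return super().query(what)  # fall back to the two basic queries


class Truck(Player):
    """Active object that accepts only move events."""

    def __init__(self, name):
        super().__init__(name)
        self.moves = 0  # hypothetical state, for illustration only

    def clone(self):
        return copy.deepcopy(self)

    def query(self, what):
        return super().query(what)

    def execute(self, message):
        if message != "move":
            raise ValueError(f"Truck cannot handle {message!r}")
        self.moves += 1
```

Because Environment and Player declare abstract methods, attempting to instantiate either
one raises a TypeError, which mirrors the rule that no instances of a virtual class may be
created.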

The Event class stores the future actions of all the simulation’s active objects. Each
event instance contains a single action for one object. For example, an event could tell
aircraft AIR-00345 to search. The event maintains a reference to the object, shown as
ToPlayer in Figure 20. An event is dispatched to its object when it becomes the next
time-ordered event. Events are dispatched by the RunUntilSimGoalTime service of the model
class. When an event is dispatched, the Execute service for the object it references is
called with the event’s message attribute passed as a parameter to the call.
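
The dispatch rule can be illustrated with a small time-ordered event queue. The sketch
below is written in Python under stated assumptions: a binary heap holds the pending
events, ties between events with the same time are broken by insertion order, and the
Player stand-in simply records the messages it receives. None of these details are taken
from the ObjectStore implementation, which the text does not describe at this level.

```python
import heapq
import itertools


class Player:
    """Stand-in for an active object; it records the messages it receives."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def execute(self, message):
        self.received.append(message)


class Event:
    """A single future action for one player."""

    def __init__(self, time, to_player, message):
        self.time = time
        self.to_player = to_player
        self.message = message


class Model:
    def __init__(self, name):
        self.name = name
        self.sim_time = 0
        self._events = []                  # heap of (time, sequence, event)
        self._sequence = itertools.count() # tie-breaker for equal times

    def schedule(self, event):
        heapq.heappush(self._events, (event.time, next(self._sequence), event))

    def run_until_sim_goal_time(self, goal_time):
        """Dispatch events in time order until the goal time is reached."""
        if goal_time < self.sim_time:
            raise ValueError("goal time must not be in the past")
        while self._events and self._events[0][0] <= goal_time:
            time, _, event = heapq.heappop(self._events)
            self.sim_time = time
            event.to_player.execute(event.message)
        self.sim_time = goal_time
```

Events scheduled for a time later than the goal time stay in the queue, so a later call
with a larger goal time picks up where the previous call stopped.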

The Logitem class is used to store a log of all the trucks which are found during
aircraft searches. The simulation benchmark requires that this information be reported on
the summary report which is generated from the post-processing portion of the simulation
system. For each instance of the class, the information required by the summary report is
included.

5.2.3 Design of the Database Interface. The database interface portion of our
design interfaces the Motif user interface with the persistent objects which make up our
simulation model. It is a good principle of user interface design to separate the interface
from the application [17]. To accomplish this we developed an interface package which the
user interface could call to perform actions on the object-oriented database. This avoided
placing DBMS calls in the same module with Motif calls.

A possibly confusing portion of our design is the simulation executive. The simulation
executive controls the execution of the simulation through time. In our design, the
functions of the simulation executive are shared by the user interface and the model. The
model provides the basic ability to simulate to a specified goal time, and the user interface
controls more complex simulation executions, such as real time simulation. This was
required due to the tight coupling of the user interface with simulation control. For example,
the interface is required to display the current simulation time as each time slice is
executed in the simulation. We leave it to future research to better integrate the simulation
executive into a graphical simulation environment.
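
The division of labor between the user interface and the model can be sketched as a
pacing loop that sits on the interface side. The Python below is an illustration only.
It assumes that simulation time is measured in seconds, that the wall-clock pause for one
slice is the time slice multiplied by the time ratio (wall clock time divided by
simulation time), and that the time spent executing the slice itself is ignored. The
function name, its parameters, and the StubModel class are hypothetical.

```python
import time


class StubModel:
    """Minimal stand-in: it only knows how to advance to a goal time."""

    def __init__(self):
        self.sim_time = 0

    def run_until_sim_goal_time(self, goal_time):
        self.sim_time = goal_time


def run_real_time(model, time_slice, time_ratio, stop_requested, report,
                  sleep=time.sleep):
    """Advance the model one time slice at a time, paced by the time ratio."""
    while not stop_requested():
        model.run_until_sim_goal_time(model.sim_time + time_slice)
        report(model.sim_time)           # e.g. refresh the dialog
        sleep(time_slice * time_ratio)   # wall-clock seconds for this slice
```

Passing the sleep function as a parameter keeps the loop testable without waiting, and
the stop_requested callback plays the role of the stop push button in the dialog.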

Our  design  for  the  simulation  benchmark  is  now  complete.  Due  to  the  nature  of 
the  benchmark,  no  design  is  necessary  for  the  task  management  component.  In  the  next 
section  we  describe  our  implementations  of  the  simulation  benchmark. 

5.3  Simulation  Benchmark  Implementations 

This section describes our implementations of the simulation benchmark. All of our
implementations were built from the design described in the previous section. We created
an implementation of the simulation benchmark for the ObjectStore DBMS, and we also
created a non-persistent version of the benchmark. The non-persistent version does not
provide most of the capability required by the benchmark but does provide a measure of
how fast the simulation would execute if it were allowed to execute exclusively in memory.
A screen-print of our simulation benchmark implementation is shown in Figure 21. This
figure shows the ObjectStore DBMS version of the benchmark.

5.3.1 ObjectStore Implementation. The first implementation of the simulation
was done using the ObjectStore DBMS. With our experience developing the OO1 benchmark
in ObjectStore, the implementation was reasonably simple. ObjectStore provided
inheritance in the same manner as the C++ programming language, and the hierarchy
required by our simulation benchmark design was not difficult to implement.

5.3.2 Non-Persistent Implementation. The non-persistent implementation of the
simulation benchmark was developed from the ObjectStore version of the benchmark. Due
to the close ties between ObjectStore and the C++ language, we were able to remove all the
database commands and execute the program as a stand-alone C++ program. The
non-persistent version of the benchmark is interesting because it represents the current
typical method of simulation execution (where all the simulation data structures exist only
in memory). The non-persistent version of the benchmark is not a valid implementation
of the benchmark because it provides no support for multi-user use, nor for saving the
simulation results. It does provide us with a yardstick to measure the performance cost
for the additional functionality of an object-oriented DBMS.

5.3.3 Itasca and Matisse. We did not complete implementations of the simulation
benchmark for Itasca or Matisse due to time constraints.

5.3.4 Multiple Object Model Problem. Using Motif in the C++ programming
language along with an object-oriented DBMS required an understanding of several
different object models. The C++ programming language provides an object model. The Xt
toolkit (the basis for Motif programming) uses a different object model based upon the
use of coding conventions. This plethora of object models adds unnecessary complexity
to the task of program development and made our task of implementing the simulation
benchmark more time consuming than anticipated.

5.4 Summary


This chapter has described the simulation benchmark developed for this research.
The specification for the simulation benchmark was described. Then our design for the
benchmark was given. Finally, each implementation we developed for the benchmark
was described. The next chapter, Chapter VI, examines our results from running the
benchmarks.

VI. Benchmark Results Analysis


This chapter examines our results for the OO1 benchmark and the simulation
benchmark. The configuration used for both benchmarks is shown in Table 4. For all our
benchmark runs, two Sun SPARCstation 2 workstations were used. The first workstation
acted as the database server and the other workstation as a client. On the server machine,
the first hard disk held the operating system, the second held the database software, and
the third held the benchmark databases. On the client workstation, the first hard disk
held the operating system and the second held the ObjectStore client software. Neither
Itasca nor Matisse required any software to be on the client machine (except the
benchmark program). We first present our results for the OO1 benchmark and then our results
for the simulation benchmark.

6.1 OO1 Benchmark Results

In this section we examine our results from the OO1 benchmark. The OO1 benchmark
was implemented on the Itasca, Matisse, and ObjectStore object-oriented DBMSs as
described in Chapter IV. We ran the benchmark in the following configurations:

•  Small  remote  database 

•  Small  local  database 

•  Small  remote  database  with  no  locality  of  reference

•  Small  local  database  with  no  locality  of  reference 

•  Large  remote  database 

•  Large  local  database 

Only the small remote and the large remote configurations are required by the OO1
benchmark specification. The other configurations are optional [9].

For each benchmark configuration (such as the small local database configuration),
five complete benchmark runs were made. The average of the five benchmark results is
reported. Complete data for our OO1 benchmark results is contained in Appendix B and
a statistical analysis of our results is presented in Appendix C.

Table 4. AFIT Benchmark System Configuration

Server Machine       prowler
                     Sun SPARCstation 2
  Operating System   Sun/OS 4.1.3
  Memory             48 Mbytes
  Swap               96 Mbytes
  Hard Disks         200 Mbyte
                     200 Mbyte
                     200 Mbyte

Client Machine       doc
                     Sun SPARCstation 2
  Operating System   Sun/OS 4.1.3
  Memory             48 Mbytes
  Swap               96 Mbytes
  Hard Disks         200 Mbyte
                     200 Mbyte

DBMS Software
  Itasca             Version 2.2
  Matisse            Version 2.2.0
  ObjectStore        Version 2.0.1
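
Each value reported in this chapter is the mean of five complete runs. A minimal sketch
of that reduction is given below in Python. The function name is hypothetical, and the
sample standard deviation is included only as a convenient measure of run-to-run spread;
the statistical procedure actually used is the one described in Appendix C, which this
sketch does not attempt to reproduce.

```python
from statistics import mean, stdev


def summarize_runs(elapsed_seconds):
    """Reduce the elapsed times of repeated runs to a mean and a spread."""
    if len(elapsed_seconds) < 2:
        raise ValueError("at least two runs are needed to estimate spread")
    return {
        "runs": len(elapsed_seconds),
        "mean": mean(elapsed_seconds),
        "stdev": stdev(elapsed_seconds),  # sample standard deviation (n - 1)
    }
```

For example, summarize_runs applied to five elapsed times for one measure of one
configuration yields the single mean value that would appear in a results table.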

The next four sections describe our results for the small database configuration.
The small database for the OO1 benchmark contains 20,000 parts and 60,000 connections
between those parts. All the reported values for elapsed time in our results are in seconds.

6.1.1 Small Remote Database Results. The most important results for the OO1
benchmark are shown in Table 5, and graphically in Figure 22. The OO1 benchmark total
is identified in Table 5 by L+T+I, which represents the sum of the lookup, traversal,
and insert measurements. ObjectStore is clearly the best performer. The small remote
database is the configuration which Cattell states most object-oriented DBMS applications
require [9]. This configuration models a network client working with a database server
located on a remote computer system. ObjectStore is 972% faster than Itasca and 243%
faster than Matisse in our cold results. For the warm results, ObjectStore is 9551% faster
than Itasca and 2391% faster than Matisse.* An analysis of our results, which is detailed
in Appendix C, shows that they are statistically significant at the α = 0.05 level.

*To calculate that database A is n% faster than database B, we used the following formula:
$\frac{\text{Execution time}_B}{\text{Execution time}_A} = 1 + \frac{n}{100}$.
This definition is documented by Hennessy and Patterson in [18].

Table 5. OO1 Benchmark Results for Small Remote Database

                 Lookup          Traversal           Insert            L+T+I
DBMS            Cold     Warm     Cold     Warm     Cold     Warm     Cold     Warm
Itasca       277.278  213.162  347.040  251.136  134.765  129.935  759.083  594.233
Matisse      125.982   68.448   53.111   35.387   64.076   49.507  243.169  153.382
ObjectStore   28.191    1.239   26.322    1.734   16.285    3.184   70.798    6.157

We were surprised at the large differences between the benchmark results. Cattell
proposed a goal of about 10 seconds for each measurement (30 seconds for the L+T+I
benchmark total) [9]. ObjectStore meets this objective for its warm results. The Itasca
and Matisse results are much too slow to meet this goal even in the warm results, although
Matisse is closer than Itasca.

There is a much larger change between the cold and warm results for ObjectStore than
for Itasca and Matisse. From Table 5 we can calculate that the warm results are 28% faster
than the cold results for Itasca, 59% faster than the cold results for Matisse, and 1050%
faster than the cold results for ObjectStore. We believe that this is due to the virtual
memory mapping supported by ObjectStore. Once data is read from the ObjectStore
server, it is accessed at memory speeds. Another reason for the difference is the benchmark
database size. The benchmark database is smaller (see Figure 13) in ObjectStore than it
is in Itasca or Matisse. If an application’s database takes up less total storage, then more
of it can be cached in a fixed amount of memory.
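
The percentages quoted in this section can be reproduced from the L+T+I totals in
Table 5 with the "n% faster" definition given in the footnote of Section 6.1.1. The
Python below is a small worked check; the function and dictionary names are chosen for
the example only.

```python
def percent_faster(time_slower, time_faster):
    """Return n such that time_slower / time_faster = 1 + n / 100."""
    return (time_slower / time_faster - 1.0) * 100.0


# L+T+I totals from Table 5, in seconds: (cold, warm)
TABLE_5_TOTALS = {
    "Itasca": (759.083, 594.233),
    "Matisse": (243.169, 153.382),
    "ObjectStore": (70.798, 6.157),
}


def warm_versus_cold(dbms):
    """How much faster the warm total is than the cold total for one DBMS."""
    cold, warm = TABLE_5_TOTALS[dbms]
    return percent_faster(cold, warm)


def objectstore_versus(dbms, warm=False):
    """How much faster ObjectStore is than another DBMS on the L+T+I total."""
    column = 1 if warm else 0
    return percent_faster(TABLE_5_TOTALS[dbms][column],
                          TABLE_5_TOTALS["ObjectStore"][column])
```

Rounded to whole percentages, warm_versus_cold gives 28, 59, and 1050 for Itasca,
Matisse, and ObjectStore, and objectstore_versus gives 972 and 243 for the cold totals
and 9551 and 2391 for the warm totals, matching the text.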

The next section examines results for the small database size, but with the benchmark
program run locally rather than remotely.

6.1.2 Small Local Database Results. After examining the results for the remote
case, it is interesting to examine results for a local client. In the local configuration,
the client executes on the same computer as the database server. Our results for this
configuration are shown in Table 6, and graphically in Figure 23. Although we do see an
improvement in the results for Itasca and Matisse, ObjectStore remains the clear winner.
ObjectStore is 734% faster than Itasca and 199% faster than Matisse in our cold results.
For the warm results, ObjectStore is 7797% faster than Itasca and 1784% faster than
Matisse. The results are statistically significant at the α = 0.05 level.

The performance of Itasca and Matisse improved in the local configuration. Itasca
was 20% faster for the cold results and 12% faster for the warm results. Matisse was 7%
faster for the cold results and 21% faster for the warm results.

In a very surprising result, ObjectStore was faster in the remote configuration than
in the local configuration. This was true for all the OO1 benchmark measures. Our results
show the remote configuration was 6.9% faster than the local configuration for the cold
time and 9.49% faster than the local configuration for the warm time. However, when we
examined these results statistically (see Appendix C), we found that there is not enough
evidence to prove that this finding is significant at the α = 0.05 level. It is possible that
competition between the client program (our benchmark) and the database server for the
same computer’s resources may have caused this result.

Table 6. OO1 Benchmark Results for Small Local Database

                 Lookup          Traversal           Insert            L+T+I
DBMS            Cold     Warm     Cold     Warm     Cold     Warm     Cold     Warm
Itasca       251.064  200.107  248.967  211.769  130.855  120.430  630.886  532.306
Matisse       91.652   36.058   70.499   41.030   64.279   49.899  226.431  126.987
ObjectStore   28.859            29.029            17.791

Table 7. OO1 Benchmark Results for Small Remote Database (NLOR)

                 Lookup          Traversal           Insert            L+T+I
DBMS            Cold     Warm     Cold     Warm     Cold     Warm     Cold     Warm
Itasca       252.481  203.955  380.217  301.063           160.953  838.700  665.972
Matisse      146.186   78.955  120.816   63.672            61.556  355.342  204.182
ObjectStore   29.765    1.250   39.574    1.766   33.045    9.919  102.384   12.936

6.1.3 Small Remote Database (NLOR) Results. The OO1 benchmark requires
that the database be built with a large degree of locality of reference. 90% of the
connections between parts must be randomly connected to 1% of the closest parts. Parts are
close if they have numerically similar part identifiers [9]. To further investigate the
performance of our three databases, we removed this requirement. Our results for the small
remote database with no locality of reference (NLOR) configuration are shown in
Table 7, and graphically in Figure 24. ObjectStore is still the top performer, even without
the locality of reference requirement. ObjectStore is 719% faster than Itasca and 247%
faster than Matisse in our cold results. For the warm results, ObjectStore is 5048% faster
than Itasca and 1478% faster than Matisse. The results are statistically significant at the
α = 0.05 level.
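
The locality of reference rule, and its removal for the NLOR configurations, can be
sketched as a connection generator. The Python below is an illustration under several
assumptions: three connections per part (inferred from 20,000 parts and 60,000
connections), "the closest 1%" read as a window of part identifiers centered on the
source part, and a seeded pseudo-random generator. It is not the load program used for
any of the three DBMSs.

```python
import random


def close_window(num_parts):
    """Half-width of the identifier window treated as 'the closest 1%'."""
    return max(1, num_parts // 200)


def build_connections(num_parts, locality=0.9, per_part=3, seed=0):
    """Return (from_part, to_part) pairs; locality=0.0 gives an NLOR build."""
    rng = random.Random(seed)
    half = close_window(num_parts)
    connections = []
    for part in range(1, num_parts + 1):
        for _ in range(per_part):
            if rng.random() < locality:
                # connect to a randomly chosen nearby part identifier
                low = max(1, part - half)
                high = min(num_parts, part + half)
                target = rng.randint(low, high)
            else:
                # connect to any randomly chosen part
                target = rng.randint(1, num_parts)
            connections.append((part, target))
    return connections


def fraction_close(connections, num_parts):
    """Fraction of connections whose endpoints fall within the close window."""
    half = close_window(num_parts)
    close = sum(1 for a, b in connections if abs(a - b) <= half)
    return close / len(connections)
```

With the default locality about nine connections in ten stay inside the window, while a
build with locality=0.0 scatters almost all connections across the whole identifier
range, which is the access pattern the NLOR configurations are meant to exercise.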

ObjectStore, in the warm results, is the most affected by the loss of locality of
reference. ObjectStore is 110% faster with locality of reference, Itasca is 12% faster with
locality of reference, and Matisse is 33% faster with locality of reference. For the cold
results, ObjectStore is 45% faster with locality of reference, Itasca is 10% faster with
locality of reference, and Matisse is 46% faster with locality of reference. Itasca is affected
the least by the loss of locality of reference.

Table 8. OO1 Benchmark Results for Small Local Database (NLOR)

                 Lookup          Traversal           Insert            L+T+I
DBMS            Cold     Warm     Cold     Warm     Cold     Warm     Cold     Warm
Itasca       234.236  197.707  342.590  270.023  188.591  150.635  765.417  618.365
Matisse      107.941   31.779  145.648   58.183   98.353   61.876  351.942  151.839
ObjectStore   33.249    1.312   44.217    1.792   39.102   11.914  116.568   15.017


6.1.4 Small Local Database (NLOR) Results. Our results for the small local
database configuration with no locality of reference are shown in Table 8, and
graphically in Figure 25. ObjectStore is the top performer. ObjectStore is 557% faster than
Itasca and 202% faster than Matisse in our cold results. For the warm results, ObjectStore
is 4018% faster than Itasca and 911% faster than Matisse.

We experienced problems with Itasca in this benchmark configuration. The database
could not get enough system memory to execute the benchmark properly. We were only
able to complete a single benchmark run and obtain partial results for the other four runs
(see Appendix B). Itasca support was unable to help us solve this problem since it was a
limitation of our workstations. Hence, while we were able to show statistical significance
for the differences in the Matisse and ObjectStore results to the α = 0.05 level, this was
not possible for Itasca.

This concludes our results for the small database configuration. A summary of our
OO1 benchmark results is shown in Table 9 for the small database configurations. Table 9
shows all our results in terms of the percentage by which ObjectStore is faster than Itasca
or Matisse. The next two sections describe our results for the large database configuration.
The large database contains 200,000 parts and 600,000 connections between those parts.

Figure 25. OO1 Benchmark Results for Small Local Database (NLOR)

Table 9. Summary of OO1 Benchmark Small Database Results

                                          ObjectStore versus
Benchmark Configuration                   Itasca           Matisse
Small Remote         cold  Lookup         884% faster      347% faster
                           Traversal      1218% faster     102% faster
                           Insert         728% faster      293% faster
                           L+T+I          972% faster      243% faster
                     warm  Lookup         17104% faster    5428% faster
                           Traversal      14383% faster    1941% faster
                           Insert         3981% faster     1455% faster
                           L+T+I          9551% faster     2391% faster
Small Local          cold  Lookup         770% faster      218% faster
                           Traversal      758% faster      143% faster
                           Insert         636% faster      261% faster
                           L+T+I          734% faster      199% faster
                     warm  Lookup         15607% faster    2730% faster
                           Traversal      9508% faster     1762% faster
                           Insert         3590% faster     1429% faster
                           L+T+I          7797% faster     1784% faster
Small Remote (NLOR)  cold  Lookup         748% faster      391% faster
                           Traversal      861% faster      205% faster
                           Insert         523% faster      167% faster
                           L+T+I          719% faster      247% faster
                     warm  Lookup         16216% faster    6216% faster
                           Traversal      16948% faster    3505% faster
                           Insert         1523% faster     521% faster
                           L+T+I          5048% faster     1478% faster
Small Local (NLOR)   cold  Lookup         604% faster      225% faster
                           Traversal      675% faster      229% faster
                           Insert         382% faster      151% faster
                           L+T+I          557% faster      202% faster
                     warm  Lookup         14969% faster    2322% faster
                           Traversal      14968% faster    3147% faster
                           Insert         1164% faster     419% faster
                           L+T+I          4018% faster     911% faster

6.1.5 Large Remote Database Results. Our results for the large database
configurations are not as complete as those obtained for the small database configurations.
We encountered problems with Itasca and Matisse when attempting to build the large
benchmark databases and were unable to complete large database measurements on the
Itasca DBMS.

Itasca, as was seen in the small local NLOR configuration, did not have enough
memory to load the large database. The Itasca database server would consume system
memory until the benchmark program, which was using memory from the same system,
would fail. In the attempts we made to build the large database in Itasca, failure would
occur after the program was allowed to run for about 27 hours. We estimated that a
remote build of the large database would have taken about 10 days but did not attempt
such a build due to the low probability of success.

We encountered a very different problem with the Matisse DBMS. Matisse writes
data to the disk without attempting to compact the data into the smallest space.
Therefore, large amounts of disk space are consumed during the creation of the OO1 benchmark
database. To compact the disk space, a program called mts-collect_versions must be
executed on the database. For the small database configuration, we allowed the database
to be built and then ran the collect versions program. But for the large database, we did
not have enough disk space to follow this procedure. We worked with Matisse support
to come up with a solution and decided the only solution was to manually monitor the
database load. For the load, we allocated the entire 200 Mbytes of our database hard disk
to Matisse. The load was run until 90% of the Matisse database was filled, then the load
program was stopped (using the control-Z command). With the load stopped, the collect
versions program was executed to compact the disk space in use by Matisse. Using this
method we were able to build the large benchmark database in Matisse in about 1.5 weeks.
We were then able to execute the OO1 benchmark measures.

We encountered no problems with the ObjectStore database for the large database
measures. We examine the remote results in this section and the local results in the next.

Our results for the large remote database configuration are shown in Table 10, and
graphically in Figure 26. ObjectStore is the best performer, but not by as wide a margin
as in the small database configuration. ObjectStore is 37% faster than Matisse in our cold
results and 120% faster than Matisse in our warm results. The results are statistically
significant at the α = 0.05 level.

Table 10. OO1 Benchmark Results for Large Remote Database

                 Lookup          Traversal           Insert            L+T+I
DBMS            Cold     Warm     Cold     Warm     Cold     Warm     Cold     Warm
Matisse      245.811  216.350  231.703  214.405  119.756  110.767  577.456  535.937
ObjectStore  121.811   64.019  194.698  118.016  104.126   61.330  420.636  243.364

Table 11. OO1 Benchmark Results for Large Local Database

                 Lookup          Traversal           Insert            L+T+I
DBMS            Cold     Warm     Cold     Warm     Cold     Warm     Cold     Warm
Matisse      228.080  215.053  229.644  228.602  106.308  112.559  546.682  552.772
ObjectStore  105.463   56.753  183.473  155.877   70.533   45.905  359.469  258.535

For  ObjectStore  the  warm  results  are  only  73%  faster  than  the  cold  results.  This 
difference  is  much  less  than  the  1050%  improvement  seen  in  the  small  database  results.  For 
Matisse  the  warm  results  are  only  8%  faster  than  the  cold  results.  Again,  this  difference  is 
much  less  than  the  59%  improvement  seen  in  the  small  database  results.  We  believe  that 
this  difference  is  due  to  the  larger  size  of  the  database  working  set.  In  the  small  database 
configuration  ObjectStore  must  have  been  able  to  hold  most  (or  all)  of  the  benchmark’s 
working  set  in  the  database  cache,  but  was  not  able  to  do  this  for  the  large  database 
configuration. 
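The "percent faster" figures quoted above follow from the elapsed times in Table 10. The sketch below, in Python, assumes each figure is the ratio of the slower time to the faster time, minus one, rounded to a whole percent. The function name `percent_faster` is ours and is not part of the benchmark code. Under that assumption it reproduces the four percentages quoted in the text for the large remote configuration.

```python
def percent_faster(slower_s: float, faster_s: float) -> int:
    """Return how much faster the second time is than the first, in whole percent.

    Assumed definition: (slower / faster - 1) * 100, rounded to the nearest integer.
    """
    return round((slower_s / faster_s - 1.0) * 100.0)


# L+T+I totals from Table 10 (large remote database).
matisse_cold, matisse_warm = 577.456, 535.937
objectstore_cold, objectstore_warm = 420.636, 243.364

print(percent_faster(matisse_cold, objectstore_cold))      # ObjectStore vs. Matisse, cold
print(percent_faster(matisse_warm, objectstore_warm))      # ObjectStore vs. Matisse, warm
print(percent_faster(objectstore_cold, objectstore_warm))  # ObjectStore, warm vs. cold
print(percent_faster(matisse_cold, matisse_warm))          # Matisse, warm vs. cold
```

The same function applied to the individual Lookup, Traversal, and Insert columns gives the per-measure percentages.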

6.1.6 Large Local Database Results. Our results for the large local database
configuration are shown in Table 11, and graphically in Figure 27. ObjectStore is again
the best performer. ObjectStore is 52% faster than Matisse in our cold results and 114%
faster than Matisse in our warm results. The results are statistically significant at the
α = 0.05 level.

As in the large remote configuration results, we see a much smaller difference between
the cold and warm results than for the small database results. For example, the warm
results for ObjectStore are 39% faster than the cold results. For the same configuration of
the small database they were 1103% faster.

Figure 26. OO1 Benchmark Results for Large Remote Database

A summary of our OO1 benchmark results for the large database configurations is shown
in Table 12. Table 12 expresses every result as the percentage by which ObjectStore is
faster than Matisse.

6.1.7 Benchmark Database Load Times. This section examines the time required
to build the OO1 benchmark databases. The build times and database sizes for the small
database are shown in Table 13. Note that for our three DBMSs, there is a considerable
difference in both the amount of time required and the amount of disk space necessary.
The Matisse database required a much larger amount of disk space during the build, but
was compacted down to the size shown in Table 13 before the benchmark runs were made.
We noted the large amount of space overhead required by Itasca and Matisse. Cattell
estimated a size between 4 and 5 Mbytes for the small database [9]. Only ObjectStore was
within this range.

The  build  times  and  database  sizes  for  the  large  database  are  shown  in  Table  14. 
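As a quick consistency check on the load figures, the Python sketch below tests each small database size against Cattell's 4 to 5 Mbyte estimate and converts an elapsed time to hours. It assumes that elapsed times given without units are in seconds and that 1 Mbyte = 1024 Kbytes; the text states neither. The helper names are illustrative only.

```python
KBYTES_PER_MBYTE = 1024.0  # assumption: the text does not say whether 1000 or 1024 is meant


def seconds_to_hours(seconds: float) -> float:
    """Convert an elapsed time in seconds to hours."""
    return seconds / 3600.0


def within_estimate(size_kbytes: float, low_mb: float = 4.0, high_mb: float = 5.0) -> bool:
    """True if the database size falls inside the estimated range in Mbytes."""
    size_mb = size_kbytes / KBYTES_PER_MBYTE
    return low_mb <= size_mb <= high_mb


# Small database sizes in Kbytes (Table 13).
small_db_kbytes = {"Itasca": 12470.956, "Matisse": 17338.000, "ObjectStore": 4664.000}
in_range = [name for name, kb in small_db_kbytes.items() if within_estimate(kb)]
print(in_range)

# ObjectStore large database load, listed as 83266.697 in Appendix B.
print(round(seconds_to_hours(83266.697), 3))
```

Under these assumptions only ObjectStore falls inside the estimated range, and the Appendix B load time agrees with the 23.130 hours shown in Table 14.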

6.1.8 Matisse Version Collection Results. To determine the effect of version
collection on the Matisse DBMS, we varied the delay allowed for version collection between
benchmark runs. For all our benchmark configurations using Matisse, we started the
version collection program after a run and then immediately started the next run. However,
for the Small Local database configuration and the Small Remote (NLOR) database
configuration, we also tested with a 200 second delay to allow version collection. The results,
which were inconclusive, make it apparent that the version collection program does not
have priority over an application program (see Appendix B). Matisse support confirmed
this presumption by informing us that the version collection program runs with a very low
priority inside the Matisse server.
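The run sequence described in this section can be sketched as a small driver loop. This is a hypothetical illustration in Python, not the harness used in this research: `run_benchmark` and `start_version_collection` are placeholder callables supplied by the caller, and the injected `sleep` parameter lets the 200 second delay be exercised without actually waiting.

```python
import time
from typing import Callable, List


def run_series(
    run_benchmark: Callable[[], float],
    start_version_collection: Callable[[], None],
    n_runs: int = 5,
    delay_s: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Run the benchmark n_runs times, starting version collection after each run.

    delay_s = 0 starts the next run immediately; delay_s = 200 gives the
    collector a head start before the next run begins.
    """
    results = []
    for _ in range(n_runs):
        results.append(run_benchmark())  # one complete benchmark run
        start_version_collection()       # launched after every run
        if delay_s > 0:
            sleep(delay_s)               # optional pause before the next run
    return results


# Usage with stand-in callables (no database involved).
events = []
times = run_series(
    run_benchmark=lambda: events.append("run") or 1.0,
    start_version_collection=lambda: events.append("collect"),
    n_runs=2,
    delay_s=200.0,
    sleep=lambda s: events.append(f"sleep {s:g}"),
)
print(events)
```

Because the collector runs at low priority inside the server, a fixed delay of this kind only helps if the server is otherwise idle during the pause.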

6.1.9 Reverse Traversal Results. The OO1 benchmark requires that a reverse
traversal be run as one of the measures. We executed the reverse traversal on all our
configurations but found that the results varied tremendously. This variation was due to
the random nature of the number of parts found in a reverse traversal. We have reported
the complete results for the reverse traversal in Appendix B but have not included them
in this chapter because there was no statistical significance to the results. The benchmark
specification recognizes the problems with the reverse traversal measure and does not include
it in the benchmark total.

Table 12. Summary of OO1 Benchmark Large Database Results

Benchmark Configuration            ObjectStore versus Matisse
Large Remote   cold   Lookup       102% faster
                      Traversal     19% faster
                      Insert        15% faster
                      L+T+I         37% faster
               warm   Lookup       238% faster
                      Traversal     82% faster
                      Insert        81% faster
                      L+T+I        120% faster
Large Local    cold   Lookup       116% faster
                      Traversal     25% faster
                      Insert        51% faster
                      L+T+I         52% faster
               warm   Lookup       279% faster
                      Traversal     47% faster
                      Insert       145% faster
                      L+T+I        114% faster

Table 13. OO1 Benchmark Load Times for Small Database

                            Itasca       Matisse      ObjectStore
Elapsed Time                61738.459    9963.483     102.481
Database Size (in Kbytes)   12470.956    17338.000    4664.000

Table 14. OO1 Benchmark Load Times for Large Database

                            Matisse       ObjectStore
Elapsed Time                ~1.5 Weeks    23.130 Hours
Database Size (in Kbytes)   171400.000    45296.000

Table 15. Summary of OO1 Benchmark Results

Benchmark Configuration                ObjectStore versus
                                       Itasca          Matisse
Small Remote          cold   L+T+I     972% faster     243% faster
                      warm   L+T+I     9551% faster    2391% faster
Small Local           cold   L+T+I     734% faster     199% faster
                      warm   L+T+I     7797% faster    1784% faster
Small Remote (NLOR)   cold   L+T+I     719% faster     247% faster
                      warm   L+T+I     5048% faster    1478% faster
Small Local (NLOR)    cold   L+T+I     557% faster     202% faster
                      warm   L+T+I     4018% faster    911% faster
Large Remote          cold   L+T+I     -               37% faster
                      warm   L+T+I     -               120% faster
Large Local           cold   L+T+I     -               52% faster
                      warm   L+T+I     -               114% faster

This concludes our results for the OO1 benchmark. A summary of our OO1 benchmark
results is shown in Table 15. In the next section we present our results for the
simulation benchmark.


6.2  Simulation  Benchmark  Results 

This section examines our simulation benchmark results. The simulation benchmark
was described in Chapter V. Two implementations of the simulation benchmark have been
completed to date: a persistent version using the ObjectStore DBMS and a non-persistent
version created in the C++ programming language.

The non-persistent version of the benchmark is not a valid implementation of the
benchmark because it provides no transaction model and is not persistent (it cannot save
any data to disk). What the implementation does show is the performance which would
be obtained by most current simulation systems (which run exclusively in memory). The
performance difference between the non-persistent version of the benchmark simulation
and the ObjectStore version provides a yardstick to measure the performance price paid for
the functionality gains provided by the object-oriented DBMS.

The next two sections describe the quantitative and the qualitative results for the
simulation benchmark. We only present results for the small benchmark database.
Measurements for the large database configuration were not done due to time constraints.

6.2.1 Quantitative Results. For each benchmark configuration five complete
benchmark runs were made. The average of the five benchmark measures is reported.
Complete data for our simulation benchmark results are contained in Appendix D and a
statistical analysis of our results is presented in Appendix E. A summary of our results is
shown in Table 16.

The  ObjectStore  version  of  the  benchmark  performs  much  better  than  we  expected 
compared  to  the  non-persistent  version.  For  the  ObjectStore  version  of  the  benchmark, 
the  time  slice  defines  the  size  of  a  transaction  in  terms  of  simulation  time.  It  can  be  seen 
from  the  hour  run  results  in  Table  16  that  as  the  time  slice  is  increased,  the  performance 
loss  due  to  use  of  the  object-oriented  DBMS  decreases  until  it  becomes  almost  negligible. 
Note  that  the  non-persistent  version  of  the  benchmark  does  not  write  any  of  the  model 
data  to  the  disk,  so  additional  time  would  be  required  to  save  the  model  data  in  an  actual 
simulation system. A graph of the hour run results by the time slice setting is shown in
Figure 28.
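To make the relationship between time slice and transaction count concrete, the Python sketch below computes how many transactions an hour run would commit and the relative overhead of the persistent version. It assumes that the time slice TS is given in seconds of simulation time, that an hour run covers 3600 simulated seconds, and that one transaction is committed per time slice. None of these details is stated explicitly here, and the function names are ours.

```python
import math


def transactions_per_run(run_length_s: float, time_slice_s: float) -> int:
    """Number of transactions needed to cover the run, one per time slice (assumed)."""
    return math.ceil(run_length_s / time_slice_s)


def overhead_percent(persistent_s: float, non_persistent_s: float) -> float:
    """Extra elapsed time of the persistent version relative to the in-memory one."""
    return (persistent_s / non_persistent_s - 1.0) * 100.0


print(transactions_per_run(3600, 60))   # TS = 60
print(transactions_per_run(3600, 600))  # TS = 600

# Hour run with TS = 60 from Table 16: ObjectStore (Remote) vs. non-persistent.
print(round(overhead_percent(576.894, 392.738), 1))
```

Under these assumptions a tenfold increase in the time slice cuts the number of commits tenfold, which is consistent with the shrinking overhead reported for larger time slices.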

The  ObjectStore  results  for  the  two  post-processing  measures  are  actually  faster 
than  the  non-persistent  version  of  the  benchmark.  This  is  not  surprising  because  the 
ObjectStore  DBMS  did  not  have  to  write  any  data  during  these  measures  (they  are  both 
read-only  transactions  on  the  database),  and  all  the  model  data  should  have  been  in  the 
DBMS’s  cache. 

Table 16. Simulation Benchmark Results for Small Database

Benchmark Measure      ObjectStore   ObjectStore   Non-Persistent
                       (Remote)      (Local)
Model Creation         0.645         0.633         0.007
Scenario Creation      4.184         4.285         1.603
Hour Run (TS = 60)     576.894       580.294       392.738
Hour Run (TS = 600)    485.341

Figure 28. Simulation Benchmark Hour Run Results by Time Slice Setting for Small
Database

6.2.2 Qualitative Results. Overall we found the ObjectStore DBMS to provide
a good platform for simulation system development. The DBMS provided a large amount
of functionality with a minimum of performance overhead. The following is a list of some
of the functional benefits provided by the ObjectStore DBMS to simulation systems:

• Similarity to the C++ Programming Language: Due to ObjectStore's close ties to the
C++ programming language, the ObjectStore version of the simulation benchmark
looks very much like a C++ program. The transaction model and the declaration of
persistent objects are the only major differences. This is only a benefit if a simulation
system is being developed in C++. It is a drawback if another language, such as
Ada, is being used for development.

•  Motif  Interface:  The  ObjectStore  DBMS  interacted  very  well  with  the  Motif  user 
interface  developed  for  the  simulation  benchmark. 

• Multi-User Access to Model Data: ObjectStore controlled all concurrent access to
the database. When two executing versions of the simulation benchmark worked
with the same model, we encountered no consistency problems. For example, we
were able to execute the simulation on one workstation and update a map on a
remote workstation. To provide this type of multi-user concurrency control without
the object-oriented DBMS would have required a large amount of code (the
non-persistent version of the benchmark did not allow multi-user access). Multi-user
access is a major benefit of object-oriented DBMS use.

•  Browser  Tool  Use:  The  ObjectStore  DBMS  provides  a  graphical  database  browser 
tool.  We  used  this  tool  to  examine  the  simulation  benchmark  database  in  an  ad-hoc 
fashion.  We  believe  that  graphical  browser  tools  are  useful  for  simulation  systems. 

6.3 Summary

This chapter has presented our results for the OO1 benchmark and the simulation
benchmark. Chapter VII presents conclusions and recommendations based upon our
benchmark results.

VII. Conclusions and Recommendations

7.1  Overview 

In this thesis we studied the performance of the Itasca, Matisse, and ObjectStore
object-oriented DBMSs. The OO1 benchmark was developed and run on the three DBMSs.
A new benchmark, the AFIT Simulation benchmark, was developed and implemented on
the ObjectStore DBMS. In this chapter, we present our conclusions based upon the results
presented in Chapter VI and summarize some important lessons learned about benchmarking.
Finally, recommendations are presented for future research in object-oriented DBMS
performance.

7.2  Conclusions 

This section presents our conclusions. These conclusions are based upon our work
with the three commercial object-oriented DBMSs and upon the benchmark results
presented in Chapter VI.

• ObjectStore was the top performer on the OO1 benchmark. In the most critical
benchmark configuration, the small remote database, ObjectStore was the clear winner.
ObjectStore was 972% faster than Itasca and 243% faster than Matisse in our
cold results, and was 9551% faster than Itasca and 2391% faster than Matisse in our
warm results. In all the other configurations we tested, ObjectStore was the fastest
DBMS, although it was found that ObjectStore was the most sensitive to locality of
reference.

• There is wide variation in the performance of commercial object-oriented DBMSs.
Investigating the performance of an object-oriented DBMS is critical. The commercial
systems available today are by no means a commodity item (as relational systems
are becoming). An object-oriented DBMS may have all the functional capability
required by an application, but its performance may be too slow.

• A programming language interface to an object-oriented DBMS should be closely tied
to a specific language or not tied to any language at all. It proved to be very confusing
to work with the Itasca C++ API, which provided a blend of C++ functionality
with DBMS functionality. We did not examine the Itasca Lisp API during this
research. The ObjectStore programming interface, which was closely tied to the
C++ programming language, and the Matisse programming interface, which was
not closely tied to a specific language, were much easier to learn and work with.

• Some object-oriented DBMSs provide sufficient performance to allow the creation of
persistent simulation environments. Our results on the simulation benchmark show
that for some areas of simulation, the loss of performance is acceptable given the
large gains in functionality. This is especially true if concurrent access to executing
simulation data is required, such as animating a running simulation.

7.3 Lessons Learned

The  following  is  a  summary  of  the  valuable  lessons  learned  during  our  benchmarking 
effort: 

• Use Vendor Customer Support: Without the aid of the support groups from Itasca,
Matisse, and ObjectStore, it is unlikely we would have solved many of the problems
we encountered. The current implementations of object-oriented DBMSs are very
complex pieces of system software which are difficult to understand, especially when
working with several of them at the same time. The aid of vendor customer support
is essential to any benchmarking effort.

We recommend sending benchmark source code through vendor customer support
several times because support staff are often very busy and will not always examine the
details of your implementation in just one look.

• Plan for Large Amounts of Data: We underestimated the large amount of data which
would be generated during this research effort. The data took a considerable amount
of time to consolidate and present in an understandable format. Some up-front
planning of the steps necessary to move benchmark data from the program output to
the final report will save a large amount of effort and confusion. We also recommend
that the process be automated to avoid transcription errors. For this research effort,
custom C programs were used to filter the benchmark output into a spreadsheet [5]
for calculation. Then, a second custom C program was used to filter the calculated
results into the form used in our final report.

• Limit Source Code Tuning: There is a point where making minor changes to the
benchmark source code to improve the performance is not effective. A large amount
of time can be spent working on the source code of benchmark implementations. We
recommend planning a final stop-work date for all the benchmark implementations
and working with the vendors to get the best product out by that time.

•  Disk  Space:  One  of  the  recurring  problems  during  our  research  was  that  of  the 
database  disk  filling  up.  We  underestimated  the  amount  of  storage  necessary  to 
work  with  three  DBMS  systems  at  the  same  time.  We  recommend  determining  how 
much  space  will  be  necessary  to  hold  the  largest  database  required  for  a  benchmark, 
then  doubling  that  size.  The  additional  space  may  be  used  for  backups,  or  may  be 
needed  if  one  of  the  DBMSs  requires  more  disk  space  than  anticipated. 

• Coding Standards: We developed a set of coding standards at the very start of our
research. These standards helped provide consistency across all our implementations.
Consistency can help to ensure that implementations on different DBMSs are done
fairly.
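The automation recommended under "Plan for Large Amounts of Data" can be sketched as a small filter that turns raw benchmark output into rows a spreadsheet can import. The sketch is in Python rather than C for brevity. The input format (one `measure run seconds` triple per line) is invented for illustration, since the actual output format of the benchmark programs is not shown here.

```python
import csv
import io


def filter_output(raw: str) -> str:
    """Convert lines like 'lookup 1 12.345' into CSV text with a header row.

    Lines that do not match the assumed three-field layout are skipped.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["measure", "run", "seconds"])
    for line in raw.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        measure, run, seconds = fields
        try:
            row = [measure, int(run), float(seconds)]
        except ValueError:
            continue  # non-numeric run number or time: not a result line
        writer.writerow(row)
    return out.getvalue()


sample = "lookup 1 12.345\n# comment line\ntraversal 1 20.5\n"
print(filter_output(sample))
```

Keeping the filter as a pure text-to-text function makes it easy to rerun over every benchmark log whenever the report layout changes.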

7.4  Recommendations 

The  following  are  recommendations  for  areas  of  further  research  which  we  feel  are 
needed  in  object-oriented  DBMS  performance  and  simulation: 

• A larger simulation system should be built using the ObjectStore DBMS. ObjectStore
consistently performed much better than Itasca or Matisse in the OO1 benchmark
and added very little performance overhead compared to the non-persistent
version in the simulation benchmark. ObjectStore should be investigated for use in
an actual simulation system.

• The simulation benchmark should be extended to test the use of long transactions
and version management. The current ObjectStore implementation of the simulation
benchmark is set up to allow multiple versions of a model, but the user interface does
not provide an interface to this functionality. This was partially due to time constraints
and partially due to a lack of research into how a version control system can
be used by a simulation system. It is possible that the version management system
provided by object-oriented DBMSs to support long transactions could be useful in
the simulation domain, or it may be that some changes to the version management
model (which was developed for engineering applications such as CAD and CASE)
need to be made.

7.5  Summary 

Our OO1 benchmark results for three commercial object-oriented DBMSs show that
users must be very wary when planning to use an object-oriented DBMS for any application.
Even though the DBMS may provide the functional capabilities required, the system
may not provide the level of performance required. Although our simulation benchmark
showed benefits from the use of an object-oriented DBMS in the simulation domain, further
investigation is necessary, especially in the area of version management.
Appendix A. Object-Oriented Analysis and Design Notation Summary

This appendix summarizes the object-oriented analysis and design notations used
during this research. Figure 29 gives a summary of the Coad/Yourdon object-oriented
analysis (OOA) and object-oriented design (OOD) notations. This notation was presented
in the books Object-Oriented Analysis and Object-Oriented Design [10, 11].
Figure 29. Coad/Yourdon OOA/OOD Notation Summary. The figure shows the symbols for:
Class-&-Object and Class (name in the top section, attributes in the middle section,
services in the bottom section); Gen-Spec Structure (a Generalization above
Specialization1 and Specialization2); Whole-Part Structure (a Whole above its parts,
with cardinality labels); Instance Connection (between Class-&-Object1 and
Class-&-Object2); Message Connection (from Sender to Receiver); and Subject or Design
Component (may be expanded or collapsed). Note: In addition, Object State Diagrams and
Service Charts may be used for specifying Services.
Appendix B. Detailed OO1 Benchmark Results

This appendix contains, in detail, the results obtained from our runs of the OO1
benchmark. Benchmark results are included for the Itasca, Matisse, and ObjectStore
object-oriented DBMSs.

For each different benchmark configuration (object-oriented DBMS, database size,
and benchmark variation) five complete runs of the OO1 benchmark were done. Five runs
were made rather than one to try to obtain a better picture of typical DBMS performance,
not just a single snapshot. This appendix contains both raw and summary results for all
benchmark configurations. Benchmark results for each configuration are reported using
three tables and two charts. The tables and charts are described below in the order in which
they appear for each benchmark configuration.

• Summary Results Table: This table reports the cold and warm times for the lookup,
traversal, and insert measures of the benchmark, and the benchmark total (L+T+I).
These values are reported for the five complete benchmark runs. The average and
sample standard deviation of the five runs is given at the bottom of the table. All
the times reported in this table are in seconds.

• Benchmark Results Chart: This bar chart provides a clear picture of the average
benchmark results (recorded in the second from last line of the summary results
table). It provides a graphical picture of the percentage of time spent in each
individual measurement compared to the benchmark total (L+T+I). It is important to
note that the y-axis scale of this chart is different for each benchmark configuration,
so care must be taken when using this chart for comparisons.

• Normalized Reverse Traversal Results Table: This table reports the reverse traversal
results normalized so that they can be compared to the forward traversal results. The
layout of this table is the same as the summary results table. We decided to separate
the reverse traversal results into a separate table because they are not included in the
benchmark total (L+T+I). The formula for normalizing the reverse traversal results
was described in the OO1 benchmark specification [9]. The normalization formula,
where $T_{rt,\text{normalized}}$ represents the normalized reverse traversal result, is shown in
Equation 1 below.

$$T_{rt,\text{normalized}} = \frac{T_{rt} \, N_{ft}}{N_{rt}} \qquad (1)$$

In Equation 1, $T_{rt}$ is the elapsed time measured and $N_{rt}$ is the number of parts
actually found in a single reverse traversal measure. $N_{ft}$ is the number of parts
found in a single forward traversal measure, which for the OO1 benchmark is always
3,280 parts. All the times reported in this table are in seconds.

• Raw Benchmark Results Table: This table reports the raw results for the five
benchmark runs. In fact, this is the exact output given by our benchmark program and was
automatically filtered into the form presented in this appendix to avoid transcription
errors. All the times reported in this table are in seconds.

• Average Individual Benchmark Measures Chart: The OO1 benchmark requires that
each measurement be performed 10 times for each complete iteration of the
benchmark. These 10 runs are labeled Run 1 through Run 10 in this chart and in the raw
benchmark results table. Run 1 is reported as the cold result and the average of runs
2 through 10 is reported as the warm result. This chart conveys two things. First,
the L+T+I line displays the average total for each of the 10 runs. This line gives a
graphical picture of the average benchmark total time throughout the 10 runs.
Second, the three bars displayed for each run provide a graphical view of the
relationship between the individual benchmark measures and the L+T+I line throughout
the 10 runs. It is important to note that the y-axis scale of this chart is different
for each benchmark configuration, so care must be taken when using this chart for
comparisons.

The database load times and the size required for the OO1 benchmark database are
shown in Table 17 for the small database and Table 18 for the large database. The large
database build for Matisse could not be automated (due to a lack of disk space) and was
carried out in about 1.5 weeks.

Table 17. OO1 Benchmark Load Times for Small Database

                            Itasca       Matisse      ObjectStore
Elapsed Time                61738.459    9963.483     102.481
Database Size (in Kbytes)   12470.956    17338.000    4664.000

Table 18. OO1 Benchmark Load Times for Large Database

                            Matisse       ObjectStore
Elapsed Time                ~1.5 Weeks    83266.697
Database Size (in Kbytes)   171400.000    45296.000
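Two of the reductions described above, the reverse traversal normalization of Equation 1 and the cold/warm split of the ten runs, can be sketched in Python as follows. The sketch assumes Equation 1 scales the measured time by the ratio of forward-traversal parts to reverse-traversal parts, which is our reading of the formula. The function names and the sample timings are illustrative and do not come from the benchmark source code; only the constant 3,280 and the convention of run 1 versus runs 2 through 10 are taken from the text.

```python
from statistics import mean
from typing import Sequence, Tuple

N_FT = 3280  # parts found in a single forward traversal (fixed for the OO1 benchmark)


def normalize_reverse_traversal(t_rt: float, n_rt: int, n_ft: int = N_FT) -> float:
    """Scale a reverse traversal time to the forward traversal's part count."""
    if n_rt <= 0:
        raise ValueError("reverse traversal found no parts; time cannot be normalized")
    return t_rt * n_ft / n_rt


def cold_and_warm(run_times: Sequence[float]) -> Tuple[float, float]:
    """Return (cold, warm): run 1, and the average of runs 2 through 10."""
    if len(run_times) != 10:
        raise ValueError("expected exactly 10 runs")
    return run_times[0], mean(run_times[1:])


# Made-up timings: a reverse traversal that found half as many parts as a forward one.
print(normalize_reverse_traversal(t_rt=5.0, n_rt=1640))
print(cold_and_warm([9.0] + [3.0] * 9))
```

Because the number of parts found in a reverse traversal varies randomly, the scale factor also varies from run to run, which is one reason the normalized results show large standard deviations.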
B.1 OO1 Benchmark Results for the Itasca DBMS

This section reports our OO1 benchmark results for the Itasca DBMS. Results for
the following database configurations are provided:

• Itasca Small Remote Database: The results for this benchmark configuration are
reported in Tables 19, 20, and 21 and in Figures 30 and 31.

• Itasca Small Local Database: The results for this benchmark configuration are
reported in Tables 22, 23, and 24 and in Figures 32 and 33.

• Itasca Small Remote Database with No Locality of Reference (NLOR): The results
for this benchmark configuration are reported in Tables 25, 26, and 27 and in
Figures 34 and 35.

• Itasca Small Local Database with No Locality of Reference: The results for this
benchmark configuration are reported in Tables 28 and 29 and in Figures 36 and
37. The results for this benchmark configuration are limited because the Itasca
DBMS did not have enough memory to execute in this configuration. Only partial
benchmark runs completed and no reverse traversal measures ran to completion.
Figure 30. Itasca OO1 Average Benchmark Results for Small Remote Database

Table 20. Itasca OO1 Normalized Reverse Traversal Results for Small Remote Database

Figure 31. Itasca OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Small Remote Database

Figure 32. Itasca OO1 Average Benchmark Results for Small Local Database

Table 23. Itasca OO1 Normalized Reverse Traversal Results for Small Local Database

Benchmark      Cold        Warm
               258.015     776.981
Average:       2160.827    537.353
Sample STD:    2633.055    370.426

Figure 33. Itasca OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Small Local Database

Figure 34. Itasca OO1 Average Benchmark Results for Small Remote Database (NLOR)


Table 26. Itasca OO1 Normalized Reverse Traversal Results for Small Remote Database
(NLOR)

Benchmark      Cold       Warm
1              303.841    424.245
2              438.734    1573.076
3              530.117    400.771
4              332.671    271.608
5              401.202    309.705
Average:       401.313    595.881
Sample STD:    89.726     549.883
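The Average and Sample STD rows can be checked directly from the five run values. The sketch below uses Python's standard `statistics` module, whose `stdev` function computes the sample (n - 1) standard deviation, on the cold column of Table 26. The variable name `cold` is ours.

```python
from statistics import mean, stdev

# Cold results for benchmark runs 1 through 5 (Table 26).
cold = [303.841, 438.734, 530.117, 332.671, 401.202]

print(f"Average:    {mean(cold):.3f}")
print(f"Sample STD: {stdev(cold):.3f}")
```

The same two calls applied to the warm column show how a single outlying run (run 2) inflates the sample standard deviation.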
Figure 35. Itasca OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Small Remote Database (NLOR)

Figure 36. Itasca OO1 Average Benchmark Results for Small Local Database (NLOR)

Figure 37. Itasca OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Small Local Database (NLOR)
B.2 OO1 Benchmark Results for the Matisse DBMS

This section reports our OO1 benchmark results for the Matisse DBMS. Some
benchmark runs were duplicated to determine the impact on performance of the
mts_collect_versions program, which Matisse requires to compact disk space in the
database. Results for the following database configurations are provided:

• Matisse Small Remote Database: The results for this benchmark configuration are
reported in Tables 30, 31, and 32 and in Figures 38 and 39. The mts_collect_versions
program was executed after each benchmark iteration with no delay between the
execution and the start of the next benchmark iteration.

• Matisse Small Local Database: The results for this benchmark configuration are
reported in Tables 33, 34, and 35 and in Figures 40 and 41. The mts_collect_versions
program was executed after each benchmark iteration with no delay between the
execution and the start of the next benchmark iteration.

• Matisse Small Local Database (200 second delay): The results for this benchmark
configuration are reported in Tables 36, 37, and 38 and in Figures 42 and 43. The
mts_collect_versions program was executed after each benchmark iteration. A 200
second delay from the start of the version collection program to the start of the next
benchmark iteration was used to allow version collection without interference from
the benchmark program.

• Matisse Small Remote Database with No Locality of Reference (NLOR): The
results for this benchmark configuration are reported in Tables 39, 40, and 41 and
in Figures 44 and 45. The mts_collect_versions program was executed after each
benchmark iteration with no delay between the execution and the start of the next
benchmark iteration.

•  Matisse  Small  Remote  Database  with  No  Locality  of  Reference  (200  second  delay) — 
The  results  for  this  benchmark  configuration  are  reported  in  Tables  42,  43,  and  44 
and  in  Figures  46  zmd  47.  The  mts^collectjversions  program  was  executed  after  each 
benchmark  iteration.  A  200  second  delay  from  the  start  of  the  version  collection 
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program  to  the  start  of  the  next  benchmark  iteration  was  done  to  allow  version 
collection  without  iterference  from  the  benchmark  program. 

•  Matisse  Smtdl  Local  Database  with  No  Locality  of  Reference  —  The  results  for  this 
benchmark  configuration  are  reported  in  Ihbles  45,  46,  and  47  and  in  Figures  48  and 
49.  The  mts-colhct-versions  program  was  executed  after  each  benchmark  iteration 
with  no  delay  between  the  execution  and  the  start  of  the  next  benchmark  iteration. 

•  Matisse  Large  Remote  Database  —  The  results  for  this  benchmark  configuration  are 
reported  in  Tables  48,  49,  and  50  and  in  Figures  50  and  51.  Only  four  out  of  five 
of  the  benchmark  runs  completed  in  this  configuration  due  to  problems  with  the 
Matisse  server. 

•  Matisse  Large  Local  Database  —  The  results  for  this  benchmark  configuration  are 
reported  in  Ihbles  51,  52,  and  53  and  in  Figures  52  and  53.  Only  four  out  of  five 
of  the  benchmark  runs  completed  in  this  configuration  due  to  problems  with  the 
Matisse  server. 
Figure 38. Matisse OO1 Average Benchmark Results for Small Remote Database

Table 31. Matisse OO1 Normalized Reverse Traversal Results for Small Remote Database

Figure 39. Matisse OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Small Remote Database

Figure 40. Matisse OO1 Average Benchmark Results for Small Local Database

Table 34. Matisse OO1 Normalized Reverse Traversal Results for Small Local Database

Figure 42. Matisse OO1 Average Benchmark Results for Small Local Database 200 Second
Delay for Version Collection

Table 37. Matisse OO1 Normalized Reverse Traversal Results for Small Local Database
200 Second Delay for Version Collection

Figure 43. Matisse OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Small Local Database 200 Second Delay for Version Collection

Figure 44. Matisse OO1 Average Benchmark Results for Small Remote Database (NLOR)

Table 40. Matisse OO1 Normalized Reverse Traversal Results for Small Remote Database
(NLOR)

Figure 45. Matisse OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Small Remote Database (NLOR)

Figure 46. Matisse OO1 Average Benchmark Results for Small Remote Database (NLOR)
200 Second Delay for Version Collection

Table 43. Matisse OO1 Normalized Reverse Traversal Results for Small Remote Database
(NLOR) 200 Second Delay for Version Collection

Figure 47. Matisse OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Small Remote Database (NLOR) 200 Second Delay for Version
Collection

Figure 48. Matisse OO1 Average Benchmark Results for Small Local Database (NLOR)

Table 46. Matisse OO1 Normalized Reverse Traversal Results for Small Local Database
(NLOR)

Figure 49. Matisse OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Small Local Database (NLOR)

Figure 50. Matisse OO1 Average Benchmark Results for Large Remote Database

Table 49. Matisse OO1 Normalized Reverse Traversal Results for Large Remote Database

Figure 51. Matisse OO1 Average Individual Benchmark Measures (and Benchmark Total)
Across the Ten Runs for Large Remote Database

Table 52. Matisse OO1 Normalized Reverse Traversal Results for Large Local Database
B.3 OO1 Benchmark Results for the ObjectStore DBMS


This section reports our OO1 benchmark results for the ObjectStore DBMS. Results
for the following database configurations are provided:

• ObjectStore Small Remote Database: The results for this benchmark configuration
are reported in Tables 54, 55, and 56 and in Figures 54 and 55.

• ObjectStore Small Local Database: The results for this benchmark configuration
are reported in Tables 57, 58, and 59 and in Figures 56 and 57.

• ObjectStore Small Local Database (second complete run): The results for this
benchmark configuration are reported in Tables 60, 61, and 62 and in Figures 58
and 59.

• ObjectStore Small Remote Database with No Locality of Reference (NLOR): The
results for this benchmark configuration are reported in Tables 63, 64, and 65 and in
Figures 60 and 61.

• ObjectStore Small Local Database with No Locality of Reference: The results for
this benchmark configuration are reported in Tables 66, 67, and 68 and in Figures 62
and 63.

• ObjectStore Large Remote Database: The results for this benchmark configuration
are reported in Tables 69, 70, and 71 and in Figures 64 and 65.

• ObjectStore Large Local Database: The results for this benchmark configuration
are reported in Tables 72, 73, and 74 and in Figures 66 and 67.
Figure 54. ObjectStore OO1 Average Benchmark Results for Small Remote Database


Table 55. ObjectStore OO1 Normalized Reverse Traversal Results for Small Remote
Database

Benchmark           Cold        Warm
1                 15.969      15.881
2               5809.241       2.794
3                 11.617       1.993
4                 17.484       1.842
5                 22.339       1.409
Average:        1157.330       4.784
Sample STD:     2590.438       6.224
Figure 55. ObjectStore OO1 Average Individual Benchmark Measures (and Benchmark
Total) Across the Ten Runs for Small Remote Database

Figure 56. ObjectStore OO1 Average Benchmark Results for Small Local Database


Table 58. ObjectStore OO1 Normalized Reverse Traversal Results for Small Local
Database

Benchmark           Cold        Warm
1                 21.565      15.664
2              10433.572       2.835
3                 22.718       1.291
4                 26.808       2.140
5                 37.703       1.912
Average:        2108.473       4.768
Sample STD:     4653.876       6.116
Figure 57. ObjectStore OO1 Average Individual Benchmark Measures (and Benchmark
Total) Across the Ten Runs for Small Local Database

Figure 58. ObjectStore OO1 Average Benchmark Results for Small Local Database
(second)


Table 61. ObjectStore OO1 Normalized Reverse Traversal Results for Small Local
Database (second)

Benchmark           Cold        Warm
1                 21.292      16.361
2              10177.650       2.803
3                 21.662       1.434
4                 27.452       2.146
5                 39.282       1.859
Average:        2057.468       4.921
Sample STD:     4539.326       6.415
Figure 60. ObjectStore OO1 Average Benchmark Results for Small Remote Database
(NLOR)


Table 64. ObjectStore OO1 Normalized Reverse Traversal Results for Small Remote
Database (NLOR)

Benchmark           Cold        Warm
1                  8.168      25.419
2                191.780      31.861
3                101.563       2.168
4                 28.905       1.554
5                 61.583       2.856
Average:          78.400      12.772
Sample STD:       72.555      14.671
Figure 61. ObjectStore OO1 Average Individual Benchmark Measures (and Benchmark
Total) Across the Ten Runs for Small Remote Database (NLOR)

Figure 62. ObjectStore OO1 Average Benchmark Results for Small Local Database
(NLOR)


Table 67. ObjectStore OO1 Normalized Reverse Traversal Results for Small Local
Database (NLOR)

Benchmark           Cold        Warm
1                 19.324      47.308
2                227.614      35.472
3                121.962       2.120
4                 45.424       1.661
5                106.063       2.388
Average:         104.077      17.790
Sample STD:       80.945      21.948

[Chart: vertical axis "Elapsed Time in Seconds" (0 to 120); series L+T+I, Lookup,
Traverse, and Insert.]

Figure 63. ObjectStore OO1 Average Individual Benchmark Measures (and Benchmark
Total) Across the Ten Runs for Small Local Database (NLOR)




Figure 64. ObjectStore OO1 Average Benchmark Results for Large Remote Database


Table 70. ObjectStore OO1 Normalized Reverse Traversal Results for Large Remote
Database

Benchmark           Cold        Warm
1                236.755     187.308
2                277.469     469.744
3                220.835     195.891
4               8785.680     584.968
5                342.571     204.768
Average:        1972.662     328.536
Sample STD:     3808.883     186.115
Figure 65. ObjectStore OO1 Average Individual Benchmark Measures (and Benchmark
Total) Across the Ten Runs for Large Remote Database

Figure 66. ObjectStore OO1 Average Benchmark Results for Large Local Database


Table 73. ObjectStore OO1 Normalized Reverse Traversal Results for Large Local
Database

Figure 67. ObjectStore OO1 Average Individual Benchmark Measures (and Benchmark
Total) Across the Ten Runs for Large Local Database
Appendix C. Statistical Analysis of the OO1 Benchmark Results


To determine whether the differences in the OO1 benchmark results were meaningful, we
ran a small-sample test of hypothesis for the difference between population means on our
results. The statistical test used in this appendix is described by McClave and Benson
in [26].

For the OO1 benchmark results, we were interested in determining whether a difference
existed between results from two different databases on the same benchmark configuration,
or from the same database on two different benchmark configurations. For example, is the
mean lookup time for the ObjectStore DBMS different from the mean lookup time for the
Itasca DBMS, and if so, which is faster? The population mean $\mu$ for our benchmark
results is unknown; we only know the sample mean $\bar{x}$ for our five runs. Therefore,
assigning $\mu_1$ to be the population mean of one of the results and $\mu_2$ to be the
population mean of the second, we wanted to detect a difference between $\mu_1$ and
$\mu_2$ if, and only if, a difference exists. We therefore tested the null hypothesis
shown in Equation 2,


$$H_0 : (\mu_1 - \mu_2) = 0 \tag{2}$$

against the alternative hypothesis shown in Equation 3.

$$H_a : (\mu_1 - \mu_2) \neq 0 \quad \text{(i.e., either } \mu_1 > \mu_2 \text{ or } \mu_2 > \mu_1\text{)} \tag{3}$$


To perform this test, we first calculated the test statistic using Equation 4. In Equation 4,
$n$ is the number of samples taken (which was always 5 for our work), $\bar{x}$ is the
sample mean, and $s$ is the sample standard deviation.


$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \tag{4}$$


The calculated test statistics are shown in the following tables:

• Table 75: Small Remote database results

• Table 76: Small Local database results

• Table 77: Small Remote database with no locality of reference results

• Table 78: Small Local database with no locality of reference results

• Table 79: Large Remote database results

• Table 80: Large Local database results

• Table 81: Matisse database results for test runs with no delay for version collection
compared with test runs allowing a 200 second delay

• Table 82: Matisse database results for a Remote client compared with a Local client

• Table 83: ObjectStore database results for a Remote client compared with a Local client

A calculated test statistic is compared with the rejection region of our test.
The rejection region is determined from Student's t distribution. We used 8 degrees of
freedom for our test ($n_1 + n_2 - 2 = 5 + 5 - 2 = 8$) and chose an $\alpha$ of 0.05. The
rejection region for our test was $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where
$t_{\alpha/2}$ is based on 8 degrees of freedom. Because the results are elapsed times, the
smaller mean is the faster result. Hence, we can make one of the following conclusions:

• The first benchmark result is faster than the second benchmark result if $t < -2.306$

• The second benchmark result is faster than the first benchmark result if $t > 2.306$

• The sample evidence is insufficient to reject the null hypothesis at $\alpha = 0.05$

Our test results are statistically significant at the $\alpha = 0.05$ level of significance.

To allow the use of our statistical test, we had to assume that the population standard
deviations for both samples were equal. This assumption is probably reasonable because most
sample variation was due to random number variations and system loading variations. The
random number variations were the same for all samples (the same random number streams
used for one group of benchmark runs were used for them all). We controlled the system
load variations (operating system, etc.) by ensuring that our benchmark was the only
non-system job running on the computer during a benchmark run.
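
The test described in this appendix can be sketched in a few lines of C++. This is an illustrative sketch only, not code from the thesis: the names SampleSummary, pooled_t_statistic, classify, and Verdict are hypothetical, and the default critical value of 2.306 is the one quoted above for 8 degrees of freedom at a significance level of 0.05.

```cpp
#include <cassert>
#include <cmath>

// Summary of one set of benchmark runs (hypothetical helper type).
struct SampleSummary {
    int n;           // number of runs in the sample
    double mean;     // sample mean, in seconds
    double std_dev;  // sample standard deviation, in seconds
};

// Pooled-variance t statistic for the difference between two means,
// with a hypothesized difference of 0 (the form of Equation 4).
// Assumes at least one sample has a non-zero standard deviation.
double pooled_t_statistic(const SampleSummary& a, const SampleSummary& b) {
    assert(a.n > 1 && b.n > 1);
    const double pooled_variance =
        ((a.n - 1) * a.std_dev * a.std_dev +
         (b.n - 1) * b.std_dev * b.std_dev) / (a.n + b.n - 2);
    const double standard_error =
        std::sqrt(pooled_variance * (1.0 / a.n + 1.0 / b.n));
    return (a.mean - b.mean) / standard_error;
}

enum class Verdict { FirstFaster, SecondFaster, Inconclusive };

// Applies the two-sided rejection region. Smaller elapsed time is faster,
// so a sufficiently negative t favors the first result.
Verdict classify(double t, double critical_value = 2.306) {
    if (t < -critical_value) return Verdict::FirstFaster;
    if (t > critical_value) return Verdict::SecondFaster;
    return Verdict::Inconclusive;
}
```

A caller would fill one SampleSummary per benchmark configuration from the Average and Sample STD rows of the result tables, compute the statistic, and pass it to classify. The default critical value only applies when both samples contain five runs.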
Table 75. Small-Sample Test Statistic for OO1 Benchmark Small Remote Database Results

Table 76. Small-Sample Test Statistic for OO1 Benchmark Small Local Database Results

Table 78. Small-Sample Test Statistic for OO1 Benchmark Small Local Database Results

Appendix D. Detailed Simulation Benchmark Results

This appendix contains, in detail, the results obtained from our runs of the simulation
benchmark. Benchmark results are included for the ObjectStore DBMS version of the
benchmark and the non-persistent version of the benchmark.

For each different benchmark configuration, five complete runs of the benchmark
were made. Five runs were made rather than one to obtain a better picture of
typical performance, not just a single snapshot. This appendix contains both raw and
summary results for all the benchmark configurations. Benchmark results are reported
using two tables and one chart. These are described below in the order in which they
appear for each benchmark configuration.

• Summary Results Table: This table reports the times for all five runs of the simulation
benchmark. The average and standard deviation of the five runs are given at the
bottom of the table. All the times reported in this table are in seconds.

• Throughput Results Table: This table reports the observed actual time ratios during
the throughput measure. The time ratio of the simulation is defined as the ratio of
wall clock time to simulation time. The reported value for the throughput measure
is the 90th percentile of the samples, but this table also reports the geometric mean.

• Hour Run Results Chart: This chart provides a clear picture of the effect time slice
has on the hour run results. The time slice defines the transaction size for the
object-oriented DBMS. As the time slice increases, fewer transactions are run during
the simulated hour. It is important to note that the axis scales of this chart are
different for each benchmark configuration, so care must be taken when using this
chart for comparisons.
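
The two summary values reported in the Throughput Results Table can be computed as in the sketch below. This is an assumption-laden illustration rather than the thesis code: the function names are hypothetical, and the nearest-rank percentile method is an assumption, since the text does not say which percentile definition was used.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Time ratio as defined above: wall clock time divided by simulation time.
double time_ratio(double wall_clock_seconds, double simulated_seconds) {
    assert(simulated_seconds > 0.0);
    return wall_clock_seconds / simulated_seconds;
}

// Geometric mean of strictly positive samples, computed through logarithms
// so that the running product cannot overflow.
double geometric_mean(const std::vector<double>& samples) {
    assert(!samples.empty());
    double log_sum = 0.0;
    for (double sample : samples) {
        assert(sample > 0.0);
        log_sum += std::log(sample);
    }
    return std::exp(log_sum / samples.size());
}

// Nearest-rank percentile (assumed method): the smallest sample such that
// at least `percent` percent of the samples are less than or equal to it.
double percentile_nearest_rank(std::vector<double> samples, int percent) {
    assert(!samples.empty());
    assert(percent >= 1 && percent <= 100);
    std::sort(samples.begin(), samples.end());
    // Integer ceiling of percent * n / 100 avoids floating-point rounding.
    const std::size_t rank = (percent * samples.size() + 99) / 100;
    return samples[rank - 1];
}
```

Calling percentile_nearest_rank(ratios, 90) on the observed time ratios gives the reported throughput value under this assumed definition, and geometric_mean(ratios) gives the companion figure.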
D.0.1 Simulation Benchmark Results for the ObjectStore DBMS. This section
reports our results for the ObjectStore DBMS version of the simulation benchmark. Results
for the following database configurations are provided:

• ObjectStore Small Remote Database: The results for this benchmark configuration
are provided in Tables 84 and 85 and in Figure 68.

• ObjectStore Small Local Database: The results for this benchmark configuration
are provided in Tables 86 and 87 and in Figure 69.
Table 85. ObjectStore Simulation Benchmark Throughput Results for Small Remote
Database

Figure 68. ObjectStore Simulation Benchmark Hour Run Results by Time Slice Setting
for Small Remote Database

Table 87. ObjectStore Simulation Benchmark Throughput Results for Small Local
Database

Figure 69. ObjectStore Simulation Benchmark Hour Run Results by Time Slice Setting
for Small Local Database
D.0.2 Non-Persistent Simulation Benchmark Results. This section reports our
results for the non-persistent version of the simulation benchmark. Results for the following
database configurations are provided:

• Non-Persistent Small Database: The results for this benchmark configuration are
provided in Tables 88 and 89 and in Figure 70.
Table 89. Non-Persistent Simulation Benchmark Throughput Results for Small Database

Figure 70. Non-Persistent Simulation Benchmark Hour Run Results by Time Slice Setting
for Small Database

Appendix E. Statistical Analysis of the Simulation Benchmark Results

To determine whether the differences in the simulation benchmark results were meaningful,
we ran a small-sample test of hypothesis for the difference between population means on
our results. The statistical test used in this appendix is described by McClave and Benson
in [26]. This test was also used in Appendix C.

For the simulation benchmark results, we were interested in determining whether a
difference existed between results from two different benchmark configurations.

The calculated test statistic is compared with the rejection region of our test. The
test statistics calculated for the Small database results are shown in Table 90. The rejection
region was determined from Student's t distribution. The rejection region for our test was
$t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on 8 degrees of
freedom. Hence, we can make one of the following conclusions:

• The first benchmark result is faster than the second benchmark result if $t < -2.306$

• The second benchmark result is faster than the first benchmark result if $t > 2.306$

• The sample evidence is insufficient to reject the null hypothesis at $\alpha = 0.05$

Our test results are statistically significant at the $\alpha = 0.05$ level of significance.

To allow the use of our statistical test, we had to assume that the population standard
deviations for both samples were equal.
Table 90. Small-Sample Test Statistic for Simulation Benchmark Small Database Results

0.007  0.065  1.916  0.496  0.592  0.774  0.052  0.287  0.186

18.015  30.456  75.714  320.708  304.332  168.710  89.990  -5.476  1.624 


Appendix F. Benchmark Support Library

This appendix contains documentation and source code for the benchmark support
library developed for this research. This library was developed to speed implementation
of benchmark programs and to facilitate code reuse between benchmark program
implementations. The benchmark support library consists of routines for measuring
elapsed time and routines for generating random numbers. The library consists of the
following four modules:

• The stopwatch module

• The dice module

• The expon module

• The pmmlcg U(0,1) pseudo-random number generator module

Figure 71 shows a Coad/Yourdon OOA diagram of the benchmark support library (see
Appendix A for a summary of the Coad/Yourdon OOA notation).

The code developed for the benchmark support library complies with the coding
standards developed for this research. These standards are described in Appendix G.

F.1 The Stopwatch

Elapsed time is the performance measurement used by the OO1 benchmark and
the simulation benchmark. To allow accurate measurement of elapsed time, a C++ class
called stopwatch was developed. The stopwatch class uses the system clock to obtain
accurate timing. The benchmarks used in this research only require accuracy to about a
millisecond, and to this level the system clock accuracy is reasonable. The basic timing
components of this class were also used in the OO7 research project at the University of
Wisconsin-Madison [7].

The stopwatch class is used by creating an instance of the class (usually through
a declaration). Timing is started by calling the start method. Timing is stopped by
calling the stop method, which returns the elapsed time in seconds.
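
The usage pattern just described can be illustrated with a self-contained sketch. The class below is a hypothetical stand-in, not the thesis class: it keeps the same start and stop interface but takes its readings from std::chrono::steady_clock instead of gettimeofday, and the names stopwatch_sketch and time_once are introduced here for illustration only.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>

// Hypothetical stand-in with the same start/stop interface as stopwatch.
class stopwatch_sketch {
    std::chrono::steady_clock::time_point start_time;
    bool clock_running = false;
public:
    // Records the current time and marks the clock as running.
    void start() {
        start_time = std::chrono::steady_clock::now();
        clock_running = true;
    }
    // Stops timing and returns the elapsed time in seconds.
    // Returns 0.0 if stop is called before start.
    double stop() {
        const auto stop_time = std::chrono::steady_clock::now();
        if (!clock_running) {
            std::fprintf(stderr, "stop called before start\n");
            return 0.0;
        }
        clock_running = false;
        return std::chrono::duration<double>(stop_time - start_time).count();
    }
};

// Times a single call of `work` and returns the elapsed seconds.
template <typename Work>
double time_once(Work work) {
    stopwatch_sketch watch;  // created through a declaration
    watch.start();
    work();
    const double elapsed = watch.stop();
    assert(elapsed >= 0.0);  // steady_clock never runs backwards
    return elapsed;
}
```

A benchmark measure would wrap each operation under test in a call such as time_once, or declare a watch and call start and stop around the measured region directly.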
Figure 71. OOA Diagram for the Benchmark Support Library


F.1.1 stopwtch.hh. The source code for the file stopwtch.hh is listed below.

#ifndef __STOPWTCH_HH
#define __STOPWTCH_HH
/*
    stopwtch.hh
    Air Force Institute of Technology
    Timothy J. Halloran
    20 May 1993

    NOTE: many of the techniques used in this support package came from
          the "Support.C" package created by Carey, DeWitt, and Naughton
          for the OO7 benchmark (Objectivity implementation).
          [OO7 Benchmark COPYRIGHT (C) 1993 Carey DeWitt Naughton
          Madison, WI U.S.A. ALL RIGHTS RESERVED]

    Revisions:
    16 Ju 1993 -TJH- Modified to include <sys/time.h> to hide non-portable
                     system timing.
*/
#include <sys/time.h>

class stopwatch {
    struct timeval start_time;  // time recorded by start( )
    struct timeval stop_time;   // time recorded by stop( )
    int clock_running;          // non-zero between start( ) and stop( )
public:
    stopwatch( );
    void start( );    // starts timing
    double stop( );   // stops timing, and returns the duration in seconds
};
#endif // __STOPWTCH_HH

F.1.2 stopwtch.cc. The source code for the file stopwtch.cc is listed below.

/*
    stopwtch.cc
    Air Force Institute of Technology
    Timothy J. Halloran
    20 May 1993

    NOTE: many of the techniques used in this support package came from
          the "Support.C" package created by Carey, DeWitt, and Naughton
          for the OO7 benchmark (Objectivity implementation).
          [OO7 Benchmark COPYRIGHT (C) 1993 Carey DeWitt Naughton
          Madison, WI U.S.A. ALL RIGHTS RESERVED]

    Revisions:
    06 Jul 1993 -TJH- Changed all the debug output to use the "debug.hh" macros.
*/
#include <stdio.h>
#include <sys/time.h>
#include <debug.hh>
#include "stopwtch.hh"

#define TRUE 1
#define FALSE 0

//////////////////////////////////////////////////////////////////////
void stopwatch::start( )
{
    DEBUG_INIT( "STOPWATCH", "stopwatch::start" );
    DEBUG_OUT( "entering", 1 );
    if ( gettimeofday( &start_time, (struct timezone *)0 ) ) {
        fprintf( stderr, "ERROR[stopwatch::start] failed call to gettimeofday( )\n" );
    }
    clock_running = TRUE;
    DEBUG_OUT( sprintf( BUG, "seconds since Jan 1, 1970: %ld microseconds: %ld",
                        start_time.tv_sec, start_time.tv_usec ), 2 );
}

//////////////////////////////////////////////////////////////////////
double stopwatch::stop( )
{
    DEBUG_INIT( "STOPWATCH", "stopwatch::stop" );
    DEBUG_OUT( "entering", 1 );
    // get the current time first (do error checking later)
    if ( gettimeofday( &stop_time, (struct timezone *)0 ) ) {
        fprintf( stderr, "ERROR[stopwatch::stop] failed call to gettimeofday( )\n" );
        return( 0.0 );
    }
    // make sure the clock was running
    if ( ! clock_running ) {
        // can't stop the stopwatch before you start it
        fprintf( stderr, "ERROR[stopwatch::stop] stop called before start\n" );
        return( 0.0 );
    }
    clock_running = FALSE;
    DEBUG_OUT( sprintf( BUG, "seconds since Jan 1, 1970: %ld microseconds: %ld",
                        start_time.tv_sec, start_time.tv_usec ), 2 );
    // compute and return the duration
    double seconds = double( stop_time.tv_sec - start_time.tv_sec );
    double micro_seconds = double( stop_time.tv_usec - start_time.tv_usec );
    if ( micro_seconds < 0.0 ) {
        // borrow one second when the microsecond difference is negative
        micro_seconds = 1000000.0 + micro_seconds;
        seconds--;
    }
    return( seconds + micro_seconds/1000000.0 );
}

//////////////////////////////////////////////////////////////////////
stopwatch::stopwatch( )
{
    DEBUG_INIT( "STOPWATCH", "stopwatch::stopwatch" );
    DEBUG_OUT( "entering", 1 );
    // when an object is first created, the clock is not running
    clock_running = FALSE;
}
//////////////////////////////////////////////////////////////////////

F.2  The  Dice  Module 

The  dice  module  is  used  in  all  the  benchmark  implementations  in  this  thesis.  It  is 
used  to  create  pseudo-random  numbers  for  the  benchmark  programs. 

F.2.1  dice.hh.  The  source  code  for  the  file  dice.hh  is  listed  below. 

#ifndef __DICE_HH
#define __DICE_HH
/*
    dice.hh
    Air Force Institute of Technology
    Timothy J. Halloran
    18 Aug 1993
*/
// returns a random integer between "low" and "high" (inclusive)
long roll( long low, long high, int stream );
#endif // __DICE_HH

F.2.2 dice.cc. The source code for the file dice.cc is listed below.

/*
    dice.cc
    Air Force Institute of Technology
    Timothy J. Halloran
    18 Aug 1993
*/
#include <stdio.h>
#include <pmmlcg.h>
#include <debug.hh>
#include "dice.hh"

//////////////////////////////////////////////////////////////////////
long roll( long low, long high, int stream )
{
    DEBUG_INIT( "DICE", "roll" );
    DEBUG_OUT( "entering", 1 );
    // return a random integer between "low" and "high" (inclusive)
    // check for a bad input parameter
    if ( high < low ) return low;
    // pmmlcg_rand( int stream ) returns non-negative floating-point
    // values uniformly distributed over the interval (0,1)
    long result = low + (long)( pmmlcg_rand( stream ) * (double)( high + 1 - low ) );
    DEBUG_OUT( sprintf( BUG, "result %ld from stream %d", result, stream ), 2 );
    return( result );
}
//////////////////////////////////////////////////////////////////////


F.3 The Expon Module

The  expon  module  is  used  by  the  simulation  benchmark  to  produce  random  numbers 
with  an  exponential  distribution. 
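
The transformation this module relies on can be written out as a short derivation. This is the standard inverse-transform argument, not text from the thesis; the symbol $\beta$ for the requested mean and $U$ for a U(0,1) variate are notation introduced here. Its last line matches the expression returned by the expon routine, which multiplies the negated mean by the logarithm of a U(0,1) value.

```latex
\begin{align*}
F(x) &= 1 - e^{-x/\beta}, \qquad x \ge 0
  && \text{(exponential distribution function with mean } \beta\text{)} \\
U &= 1 - e^{-X/\beta} \;\Longrightarrow\; X = -\beta \ln(1 - U)
  && \text{(set } F(X) = U \text{ and solve for } X\text{)} \\
X &= -\beta \ln U
  && \text{(since } 1 - U \text{ is also distributed } U(0,1)\text{)}
\end{align*}
```

Because the generator returns values strictly inside the interval (0,1), the logarithm is always finite and negative, so the resulting variate is positive.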

F.3.1  expon.hh.  The  source  code  for  the  file  expon.hh  is  listed  below. 

#ifndef __EXPON_HH
#define __EXPON_HH
/*
    expon.hh
    Air Force Institute of Technology
    Timothy J. Halloran
    31 Aug 1993
*/
// returns an exponential random variate with mean "mean"
double expon( double mean, int stream );
#endif // __EXPON_HH

F.3.2 expon.cc. The source code for the file expon.cc is listed below.

/*
  expon.cc
  Air Force Institute of Technology
  Timothy J. Halloran
  31 Aug 1993
*/
#include<stdio.h>
#include<math.h>
#include<pmmlcg.h>
#include<debug.hh>
#include"expon.hh"
#include"dice.hh"
//////////////////////////////////////////////////////////////////////
double expon( double mean, int stream )
{
  DEBUG_INIT( "EXPON", "expon" );
  DEBUG_OUT( "entering", 1 );
  // return an exponential random variate with mean "mean"
  // pmmlcg_rand( int stream ) returns non-negative floating-point
  // values uniformly distributed over the interval (0,1)
  return -mean * log( (double)pmmlcg_rand( stream ) );
}
//////////////////////////////////////////////////////////////////////
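The expression returned by expon( ) is the standard inverse-transform step for the exponential distribution. A minimal sketch follows, assuming the uniform variate is supplied by the caller; the helper name expon_from_uniform is hypothetical and is not part of the thesis code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: the inverse-transform step used by expon( ), with
// the uniform variate passed in rather than drawn from pmmlcg_rand( ).
double expon_from_uniform( double mean, double u )
{
  assert( mean > 0.0 );
  assert( u > 0.0 && u < 1.0 );

  // The exponential CDF is F(x) = 1 - exp( -x / mean ). Setting
  // F(x) = 1 - u (which is also uniform on (0,1)) and solving for x
  // gives x = -mean * ln( u ).
  return -mean * std::log( u );
}
```

As a check on the formula, a variate of u = exp( -1 ) maps to exactly one mean, so expon_from_uniform( 2.0, std::exp( -1.0 ) ) is 2.0 up to rounding.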


F.4 A U(0,1) Random Number Generator

Both the dice module and the expon module rely upon the generation of random
numbers over U(0,1). The U(0,1) pseudo-random number generator used for all the
programs in this research comes from the book Simulation Modeling and Analysis by Law
and Kelton [24]. We decided to use this code after running some of the statistical tests for
a U(0,1) random number generator described by Law and Kelton. The UNIX system
function drand48( ) and the random number generator described by Park and Miller in [30]
(which was used by Cattell and Skeen in the original OO1 benchmark implementations [9])
were considered for use, but the generator described by Law and Kelton appeared to be
superior.


F.4.1  pmmlcg.h.  The  source  code  for  the  file  pmmlcg.h  is  listed  below. 

#ifndef __PMMLCG_H
#define __PMMLCG_H
/* [ASCII-art banner] */
#ifdef __cplusplus
extern "C" {
#endif
float pmmlcg_rand( int stream );
void pmmlcg_randst( long zset, int stream );
long pmmlcg_randgt( int stream );
#ifdef __cplusplus
}
#endif
#endif /* __PMMLCG_H */

F.4.2 pmmlcg.c. The source code for the file pmmlcg.c is listed below.

/*
  Prime modulus multiplicative linear congruential generator
  Z[i] = (630360016 * Z[i-1]) (mod(pow(2,31) - 1)), based on Marse and
  Roberts' portable FORTRAN random-number generator UNIRAN. Multiple
  streams (100) are supported, with seeds spaced 100,000 apart.
  Throughout, input argument "stream" must be an int giving the desired
  stream number. The header file pmmlcg.h must be included in the calling
  program (#include "pmmlcg.h") before using these functions.

  Usage: (three functions)

  1. To obtain the next U(0,1) random number from the stream "stream,"
     execute
         u = pmmlcg_rand( stream );
     where pmmlcg_rand is a float function. The float variable u will
     contain the next random number.

  2. To set the seed for the stream "stream" to a desired value zset,
     execute
         pmmlcg_randst( zset, stream );
     where pmmlcg_randst is a void function and zset must be a long set
     to the desired seed, a number between 1 and 2147483646 (inclusive).
     Default seeds for all 100 streams are given in the code.

  3. To get the current (most recently used) integer in the sequence
     being generated for stream "stream" into the long variable zget,
     execute
         zget = pmmlcg_randgt( stream );
     where pmmlcg_randgt is a long function.
*/
#include"pmmlcg.h"
/* define the constants */
#define MODLUS 2147483647
#define MULT1 24112
#define MULT2 26143
/* set the default seeds for all 100 streams */
static long zrng[] =
{ 0,
  1973272912,  281629770,   20006270, 1280689831, 2096730329,
  1933676060,  913566091,  246780520, 1363774876,  604901985,
  1511192140, 1259851944,  824064364,  150493284,  242708531,
    76263171, 1964472944, 1202299975,  233217322, 1911216000,
   726370633,  403498145,  993232223, 1103205531,  762430696,
  1922803170, 1385516923,   76271663,  413682397,  726466604,
   336167068, 1432650381, 1120463904,  696778810,  877722890,
  1046574445,   68911991, 2088367019,  748646416,  622401386,
  2122378830,  640690903, 1774806613, 2132646692, 2079249679,
    78130110,  862776736, 1187867272, 1361423607, 1646973084,
  1997049139,  922610944, 2046612870,  898686771,  243649646,
  1004818771,  773686062,  403188473,  372279877, 1901633483,
   498067494, 2087769658,  493157916,  597104727, 1630940798,
  1814496276,  636444882, 1663163668,  866603736,   67784357,
  1432404476,  619691088,  119026696,  880802310,  176192644,
  1116780070,  277854671, 1366680350, 1142483975, 2026948661,
  1053920743,  786262391, 1792203830, 1494667770, 1923011392,
  1433700034, 1244184613, 1147297105,  639712780, 1546929719,
   190641742, 1645390429,  264907697,  620389263, 1502074852,
   927711160,  364849192, 2049576050,  638580086,  647070247
};
/********************************************************************/
float pmmlcg_rand( int stream )
{
  long zi, lowprd, hi31;
  /* generate the next random number */
  zi = zrng[stream];
  lowprd = ( zi & 65535 ) * MULT1;
  hi31 = ( zi >> 16 ) * MULT1 + ( lowprd >> 16 );
  zi = ( ( lowprd & 65535 ) - MODLUS ) +
       ( ( hi31 & 32767 ) << 16 ) + ( hi31 >> 15 );
  if ( zi < 0 ) zi += MODLUS;
  lowprd = ( zi & 65535 ) * MULT2;
  hi31 = ( zi >> 16 ) * MULT2 + ( lowprd >> 16 );
  zi = ( ( lowprd & 65535 ) - MODLUS ) +
       ( ( hi31 & 32767 ) << 16 ) + ( hi31 >> 15 );
  if ( zi < 0 ) zi += MODLUS;
  zrng[stream] = zi;
  return ( ( zi >> 7 | 1 ) + 1 ) / 16777216.0;
}
/********************************************************************/
void pmmlcg_randst( long zset, int stream )
{
  /* set the current zrng for the stream "stream" to zset */
  zrng[stream] = zset;
}
/********************************************************************/
long pmmlcg_randgt( int stream )
{
  /* return the current zrng for the stream "stream" */
  return zrng[stream];
}
/********************************************************************/
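The two constants 24112 and 26143 used in pmmlcg.c multiply to 630360016, the multiplier quoted in the header comment, so the two-stage update in pmmlcg_rand( ) appears to be a way of applying that multiplier without overflowing a 32-bit long. The sketch below checks this equivalence with 64-bit arithmetic. The function names next_direct and next_two_stage are hypothetical, and the sketch does not reproduce the bit-splitting used in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Constants as given in pmmlcg.c
const std::int64_t MODLUS = 2147483647;  // 2^31 - 1
const std::int64_t MULT1 = 24112;
const std::int64_t MULT2 = 26143;

// Hypothetical reference step: one multiplication by 630360016 modulo
// 2^31 - 1. The product stays below 2^61, so it fits in 64 bits.
std::int64_t next_direct( std::int64_t z )
{
  assert( z > 0 && z < MODLUS );
  return ( 630360016 * z ) % MODLUS;
}

// The same step split into two smaller multiplications, each reduced
// modulo 2^31 - 1, in the order used by pmmlcg_rand( ).
std::int64_t next_two_stage( std::int64_t z )
{
  assert( z > 0 && z < MODLUS );
  z = ( MULT1 * z ) % MODLUS;
  return ( MULT2 * z ) % MODLUS;
}
```

Both functions return the same value for any seed between 1 and 2147483646, which follows from the identity (a * b * z) mod m = (b * ((a * z) mod m)) mod m.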
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Appendix  G.  Code  Style  Guide 


An important issue when writing code in three different object-oriented DBMSs is
to ensure that the code is understandable. This appendix describes the style rules which
were followed during the development of source code for this research. The purpose of
these style rules was to provide consistency across the large amount of source code which
was written during this research.

G.l  Code  Indentation  and  Spacing 

All  the  source  code  used  two  space  indentation  between  logical  levels.  No  tabs  were 
used  in  any  of  the  source  code  developed  for  this  research.  Whitespace  in  the  source  code, 
both  vertical  and  horizontal,  was  used  to  make  the  code  easier  to  read.  Figure  72  provides 
a  typical  example  of  good  indentation  and  spacing. 

G.2  Naming  Conventions 

G.2.1 Variable and Function Names. Variable and function names contained
only lowercase letters. Words in the names were separated by an underscore. The following
would have been valid variable or function names: start, micro_seconds. The following
would not have been valid variable or function names: Start, MicroSeconds. Invalid
names were used only if the interface to an external library required them. For example,
ObjectStore uses the name os_Set for a set when it is parameterized, and os_set when
it is not. Abbreviations were used in some names. All the abbreviations used in our
development are defined in Table 91.

G.2.2 Constant Names. Constant names contained only capital letters. Words
in constant names were separated by an underscore. Capital letters were also used for
C++ enumerated types. The following would have been valid constant names: TRUE,
TOP_OF_STACK. The following would not have been valid constant names: True, top_of_stack.
Invalid names were used only if the interface to an external library required them.
Abbreviations were used in some constant names. All the abbreviations used in our development
are defined in Table 91.
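The naming rules above can be expressed as two small predicates. This is a minimal sketch, not part of the thesis code: the function names are hypothetical, and the sketch assumes that digits are tolerated after the first character, which the rules do not state.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical checker: lowercase letters, with words joined by "_".
// Assumption: digits are accepted after the first character.
bool is_valid_variable_name( const std::string &name )
{
  if ( name.empty( ) || !std::islower( (unsigned char)name[0] ) ) return false;
  for ( unsigned char c : name ) {
    if ( !std::islower( c ) && !std::isdigit( c ) && c != '_' ) return false;
  }
  return true;
}

// Hypothetical checker: capital letters, with words joined by "_".
bool is_valid_constant_name( const std::string &name )
{
  if ( name.empty( ) || !std::isupper( (unsigned char)name[0] ) ) return false;
  for ( unsigned char c : name ) {
    if ( !std::isupper( c ) && !std::isdigit( c ) && c != '_' ) return false;
  }
  return true;
}

// The examples from the text, expressed as checks
void check_examples( )
{
  assert( is_valid_variable_name( "start" ) );
  assert( is_valid_variable_name( "micro_seconds" ) );
  assert( !is_valid_variable_name( "MicroSeconds" ) );
  assert( is_valid_constant_name( "TOP_OF_STACK" ) );
  assert( !is_valid_constant_name( "True" ) );
}
```

Names required by an external library, such as the ObjectStore set names, would fail these checks and were treated as permitted exceptions.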
//////////////////////////////////////////////////////////////////////
void stopwatch::start( )
{
  DEBUG_INIT( "STOPWATCH", "stopwatch::start" );

  DEBUG_OUT( "entering", 1 );

  if ( gettimeofday( &start_time, (struct timezone *)0 ) ) {

    fprintf( stderr, "ERROR[stopwatch::start] failed call to gettimeofday( )\n" );

  }

  clock_running = TRUE;

  DEBUG_OUT( sprintf( BUG, "seconds since Jan 1, 1970: %ld microseconds: %ld",
                      start_time.tv_sec, start_time.tv_usec ), 2 );

}

//////////////////////////////////////////////////////////////////////


Figure  72.  Example  of  Source  Code  Indentation  and  Spacing 


G.3 Comments

Source  code  comments  were  not  used  to  represent  things  which  are  obvious  from  the 
source  code. 


G.3.1 Module/Header Comments. The top of each module (a “.c” or “.cc” file)
and header file (a “.h” or “.hh” file) contained a comment block. The purpose of this block
was to identify the source file and inform a reader of any important information regarding
the entire module. All revisions made to a source file were listed in the comment block.
Figure 73 shows an example of a comment block for a source file which had one revision.

G.3.2 Inline Comments. The C++ comment indicator (// a comment <eol>)
was preferred to the C comment indicators (/* a comment */) for inline comments.
If a file was designed to be compiled by a C compiler, this convention was not
followed.

G.3.3 Function Headers and Separator Comments. Functions were not given
a comment block, but a separator was used to make it obvious to a reader that a new
function had been started. The function separator consisted of a line of 70 “/” characters.
/*
  stopwtch.cc
  Air Force Institute of Technology
  Timothy J. Halloran
  20 May 1993

  NOTE: many of the techniques used in this support package came from
  the "Support.C" package created by Carey, DeWitt, and Naughton
  for the OO7 benchmark (Objectivity implementation).

  [OO7 Benchmark COPYRIGHT (C) 1993 Carey DeWitt Naughton
  Madison, WI U.S.A. ALL RIGHTS RESERVED]

  Revisions:

  06 Jul 1993 -TJH- Changed all the debug output to use the "debug.hh" macros.
*/


Figure  73.  Source  File  Comment  Block  Example 
Table 91. Standard Abbreviations

Abbreviation    Full Name
app             application
db              database
cb              callback
config          configuration
fmt             format
id              identifier
it              Itasca
num             number
ma              Matisse
max             maximum
min             minimum
misc            miscellaneous
msg             message
os              ObjectStore
pos             position
prev            previous
proc            procedure
sim             simulation
str             string
temp            temporary


For a C program, a line of “*” characters inside a comment was used (the total separator
line still consisted of 70 characters).

G.4  Error  Output 

All error output was standardized. Error output was broken into two types: errors
and warnings. The difference between the two is that a program exits when an error occurs
but continues when a warning occurs. For all error output, the name of the function (or
class method) where the problem occurred appears inside brackets after the word “ERROR”
or “WARNING”. Figure 74 provides some examples of error output.
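The error output convention can be illustrated with a pair of reporting helpers. This is a minimal sketch under stated assumptions: the function names format_problem, warning, and error are hypothetical, and the exit status of 1 is an assumption, since the thesis does not say which status the programs returned.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical formatter: "kind" is expected to be "ERROR" or "WARNING";
// "where" is the function or class method in which the problem occurred.
std::string format_problem( const std::string &kind, const std::string &where,
                            const std::string &message )
{
  assert( kind == "ERROR" || kind == "WARNING" );
  return kind + "[" + where + "] " + message;
}

// A warning is reported and the program continues
void warning( const std::string &where, const std::string &message )
{
  fprintf( stderr, "%s\n", format_problem( "WARNING", where, message ).c_str( ) );
}

// An error is reported and the program exits (status 1 is an assumption)
void error( const std::string &where, const std::string &message )
{
  fprintf( stderr, "%s\n", format_problem( "ERROR", where, message ).c_str( ) );
  std::exit( 1 );
}
```

A call such as format_problem( "ERROR", "stopwatch::start", "failed call to gettimeofday( )" ) yields the text ERROR[stopwatch::start] failed call to gettimeofday( ).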


G.5  Debug  Output 

All program debug output was standardized by using the macros in the file debug.hh
(listed in the next section). This macro package was converted from a macro package used
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HAMII6[|Mia]  ta*  rwdo*  is  0 

WARNING[create_connections] invalid query result

ERROR[stopwatch::start] failed call to gettimeofday( )

Figure  74.  Examples  of  Program  Error  Output 

[oo1::oo1(oo1.cc:326)] entering
[dice::dice(dice.cc:31)] entering
[oo1::forward_traversal(oo1.cc:128)] entering
[stopwatch::stopwatch(stopwtch.cc:80)] entering
[stopwatch::start(stopwtch.cc:37)] entering
[stopwatch::start(stopwtch.cc:44)] seconds since Jan 1, 1970: 742061864
[dice::roll(dice.cc:38)] entering
[dice::roll(dice.cc:44)] result 324
[part::forward_traversal(part.cc:57)] entering
[part::forward_traversal(part.cc:59)] part id 324 (level 0)
[part::forward_traversal(part.cc:57)] entering
[part::forward_traversal(part.cc:59)] part id 325 (level 1)
[part::forward_traversal(part.cc:57)] entering
[part::forward_traversal(part.cc:59)] part id 406 (level 2)

Figure  75.  Example  of  Debug  Output 


by  Microsoft  [27].  The  macros  were  intended  to  provide  run-time  tracing  for  programs 
developed  for  the  Microsoft  Windows  environment  and  were  modified  so  they  could  be 
used  for  this  research.  The  advantages  of  the  macro  package  are  the  following:  it  does  not 
require  a  debugger,  it  is  controlled  by  environment  variables,  and  its  debug  code  can  be 
compiled  out  when  debugging  is  finished. 

Debug  output  consists  of  two  parts:  a  location  and  a  message.  The  location  consists 
of  the  function  name  where  the  debug  was  output,  the  name  of  the  source  file  which 
contains  the  function,  and  the  line  number  inside  the  source  file  where  the  debug  output 
line was located. This information is enclosed in brackets. The message contains the debug
information  which  is  to  be  output  and  is  allowed  to  be  in  any  format  desired.  Figure  75 
shows  an  example  of  program  debug  output. 
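The two-part layout of a debug line can be sketched as a single formatting function. The name format_debug is hypothetical; the thesis macros write directly to the standard error stream rather than building a string.

```cpp
#include <cassert>
#include <string>

// Hypothetical formatter for one debug line. The location part is
// "[function(file:line)]" and is followed by a free-form message.
std::string format_debug( const std::string &location, const std::string &file,
                          int line, const std::string &message )
{
  assert( line > 0 );
  return "[" + location + "(" + file + ":" + std::to_string( line ) + ")] "
         + message;
}
```

For example, format_debug( "dice::roll", "dice.cc", 44, "result 324" ) produces the text [dice::roll(dice.cc:44)] result 324.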


G.5.1 debug.hh. The source code for the file debug.hh is listed below.

#ifndef __DEBUG_HH
#define __DEBUG_HH
/*
  debug.h
  Air Force Institute of Technology
  Timothy J. Halloran
  06 Jul 1993

  NOTE: the basic ideas in this file came from "Debugging Without Debuggers"
  Microsoft Systems Journal Vol. 8, No. 4, April 1993, Pages 52-55.

  Some useful debugging macros which are controlled by environment variables.

  Revisions:

  26 Jul 1993 -TJH- Changed the DEBUG_INIT macro to locally declare the
                    "location" variable. This variable was being overwritten
                    by the debug code inside functions which were called from
                    inside a function using debug output.
*/
#ifdef DEBUG
#include<stdio.h>
#include<stdlib.h>
static int debug_level = 0;
static char BUG[300]; /* a buffer for use by "sprintf( )" */
/*
  DEBUG_INIT (macro)

  env              (char*) Contains the name of the environment variable
                           which turns tracing on or off for the function.
  current_location (char*) Application defined location information (such
                           as the name of the current function being
                           traced).
*/
#define DEBUG_INIT( env, current_location ) \
  char *location = current_location; \
  { \
    if ( getenv( env ) != NULL ) \
      debug_level = atoi( getenv( env ) ); \
    else \
      debug_level = 0; \
  }
/*
  DEBUG_OUT (macro)

  out   (char*) The message to be output.
  level (int)   The trace level at which this message is output.
*/
#define DEBUG_OUT( out, level ) { \
  if ( debug_level >= level ) \
    fprintf( stderr, "[%s(%s:%d)] %s\n", location, \
             __FILE__, __LINE__, out ); \
}
#else
/* compile out all trace instructions, if DEBUG is not defined */
#define DEBUG_INIT( env, current_location )
#define DEBUG_OUT( out, level )
#endif // DEBUG
#endif // __DEBUG_HH
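The level logic used by the debug macros can be isolated into two small functions. This is a minimal sketch with hypothetical names (trace_level_from and should_output); it takes the text of the environment variable as an argument so that it does not depend on the process environment.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: converts the text of an environment variable into
// a trace level. A null pointer (variable not set) switches tracing off.
int trace_level_from( const char *value )
{
  if ( value == nullptr ) return 0;

  // atoi( ) yields 0 for text that does not start with a number
  return std::atoi( value );
}

// A message is output when the active level is at least the level
// assigned to the message
bool should_output( int debug_level, int message_level )
{
  return debug_level >= message_level;
}
```

In a program, the level for the dice module would be obtained with trace_level_from( std::getenv( "DICE" ) ). With DICE set to 1, the "entering" messages (level 1) would appear and the level 2 messages would be suppressed.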
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Appendix  H.  Benchmark  Source  Code 

The  source  code  for  all  the  benchmark  implementations  created  for  this  research  is 
available  in  AFIT  Technical  Report  AFIT/EN-TR-93-09  [16].  The  source  code  was  not 
included  in  this  thesis  due  to  its  size.  The  technical  report  contains  source  code  for  the 
following  benchmark  implementations: 

•  Itasca DBMS implementation of the OO1 benchmark

•  Matisse DBMS implementation of the OO1 benchmark

•  ObjectStore DBMS implementation of the OO1 benchmark

•  ObjectStore DBMS implementation of the Simulation benchmark

•  A non-persistent implementation of the Simulation benchmark (in the C++
programming language)
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